Chemometrics, Machine Learning and Near Infrared Hyperspectral Imaging in Detection of Fraud in Extra-Virgin Olive Oil: A Comparative Case with FTIR, UV-Vis Spectroscopy, Raman, and GC-MS

Author

Derick Malavi

Introduction

Extra-Virgin Olive Oil (EVOO) is highly valued for its quality and health benefits, making it a target for fraud. Adulteration with cheaper oils undermines both quality and consumer trust. Traditional detection methods such as gas chromatography-mass spectrometry (GC-MS) are effective but come with drawbacks: they are destructive, time-consuming, and costly.

Hyperspectral Imaging (HSI) integrates imaging and spectroscopy to capture detailed spectral data across a range of wavelengths. Near-Infrared HSI (NIR-HSI), in particular, is non-destructive and offers both spatial and spectral information, making it ideal for detecting subtle adulteration in EVOO. By capturing a comprehensive spectral fingerprint for each pixel, NIR-HSI can identify even minor adulteration (Malavi et al., 2023).

Combining NIR-HSI with machine learning further enhances detection accuracy and efficiency. Machine learning algorithms can analyze the complex data sets generated by NIR-HSI, identifying patterns and anomalies indicative of adulteration. This combination provides a powerful, non-destructive, and efficient method for ensuring the authenticity of EVOO.

Objectives

  • Conduct Data Analysis and Classification: Perform K-Means clustering for unsupervised classification and use Principal Component Analysis (PCA) for dimensionality reduction and exploratory data analysis of various oils.

  • Develop and Validate Machine Learning Models: Create and evaluate supervised machine learning models to classify pure and adulterated olive oil, integrating data from Near-Infrared Hyperspectral Imaging (NIR-HSI) and other analytical techniques such as FTIR, UV-Vis, Raman, and GC-MS.

  • Compare and Assess Analytical Methods: Systematically compare the efficacy of NIR-HSI with other methods (FTIR, UV-Vis, Raman, GC-MS) in detecting EVOO adulteration.

  • Support Scientific Research: Provide a comprehensive data set (upon request) and reproducible code to facilitate further research and advancements in the field of food fraud detection.

Data

The data set contains pure samples of extra-virgin olive oil and samples adulterated/mixed with cheaper oils, including safflower, corn, sesame, sunflower, canola, and soybean oils, in concentration ranges of 0-20% (m/m). The samples were analyzed by GC-MS, FTIR, Raman, UV-Vis, and HSI to determine whether these methods could distinguish genuine from adulterated oil.

# Load required packages (start-up messages and warnings suppressed)
suppressWarnings(suppressMessages({
  library(caret)
  library(ggplot2)
  library(dplyr)
  library(readxl)
  library(readr)
  library(pls)
  library(janitor)
  library(class)
  library(MLmetrics)
  library(MLeval)
  library(themis)
  library(ROSE)
  library(parallel)
  library(doParallel)
  library(foreach)
  library(pheatmap)
  library(mdatools)
  library(nnet)
  library(e1071)
  library(FactoMineR)
  library(doSNOW)
  library(MASS)
  library(factoextra)
  library(xgboost)
  library(kernlab)
}))

Load and Inspect Hyperspectral Imaging Spectral Data

# Load HSI spectra data
hsi<-read_excel("HSI.xlsx")
#Check dimensions
dim(hsi)
[1] 183 228
#We have 183 observations and 228 variables
# Check a few of the column names
colnames(hsi[,c(1:4)])
[1] "sample_id"    "class_1"      "class_2"      "perc_adulter"
# Check for any missing values
anyNA(hsi)#There are no missing values
[1] FALSE
# Considering that we are conducting a binary classification, we will remove some columns
hsi<-hsi[,-c(1,3)]
table(hsi$class_1)

Adulterated   Pure EVOO 
        144          39 
# Convert class_1 to a factor
hsi$class_1<-as.factor(hsi$class_1)
# There are two classes: The oils are either pure/authentic or adulterated
# Check whether the data is normalized
print(paste('The max value is', max(hsi[,-c(1:4)]), 'and the min value is', min(hsi[,-c(1:4)])))
[1] "The max value is 0.827106 and the min value is 0.016328"

Load and Inspect Raman Spectroscopy Data

#Load Raman spectra data
raman<-read_excel("Raman.xlsx")
# Check dimensions
dim(raman)
[1]  183 1404
# We have 183 observations and 1404 variables
#Bind the data to have the same columns as HSI data
raman<-cbind(hsi[,c(1,2)],raman[,-c(1:3)])
colnames(raman[,c(1:3)])#check whether the changes have been effected
[1] "class_1"      "perc_adulter" "500"         
table(raman$class_1)#Check the class distribution

Adulterated   Pure EVOO 
        144          39 
class(raman$class_1)#ensure class_1 is a factor
[1] "factor"
# Define a min-max normalization function to scale values to the [0, 1] range
min_max_normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
raman<-min_max_normalize(raman[,-c(1:2)])
print(paste('The max value is', max(raman), 'and the min value is', min(raman)))
[1] "The max value is 1 and the min value is 0"
anyNA(raman)
[1] FALSE
dim(raman)
[1]  183 1401
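Note that `min_max_normalize()` applied to a whole data frame scales by the single global minimum and maximum of the matrix, so only one cell ends up at exactly 0 and one at exactly 1. If per-variable (column-wise) scaling were preferred instead, a minimal sketch could look as follows; the helper name and the tiny example data frame are illustrative only and are not part of the original pipeline:

```r
# Hedged sketch (not used in the analysis above): column-wise min-max
# scaling, so every variable spans [0, 1] independently, instead of the
# global scaling applied to the Raman spectra.
min_max_normalize_cols <- function(df) {
  as.data.frame(lapply(df, function(x) (x - min(x)) / (max(x) - min(x))))
}

# Small illustration on a hypothetical two-variable data frame
m <- data.frame(a = c(0, 5, 10), b = c(2, 3, 4))
min_max_normalize_cols(m)  # each column now runs from 0 to 1 independently
```

Global scaling preserves the relative intensities between wavenumbers, whereas column-wise scaling equalizes the influence of every variable; which is appropriate depends on the downstream model.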

Load and Inspect FTIR Spectroscopy Data

#Load FTIR spectra data
ftir<-read_excel("FTIR.xlsx")
#Check dimensions
dim(ftir)
[1] 183 919
#We have 183 observations and 919 variables
#Bind the data to have the same columns as HSI data
ftir<-cbind(hsi[,c(1,2)],ftir[,-c(1:3)])
colnames(ftir[,c(1:3)])#check whether the changes have been effected
[1] "class_1"            "perc_adulter"       "470.54590000000002"
table(ftir$class_1)#Check the class distribution

Adulterated   Pure EVOO 
        144          39 
class(ftir$class_1)#ensure class_1 is a factor
[1] "factor"
print(paste('The max value is', max(ftir[,-c(1:2)]), 'and the min value is', min(ftir[,-c(1:2)])))#The data is OK
[1] "The max value is 0.6911867 and the min value is 0"
dim(ftir)
[1] 183 918

Load and Inspect UV-Vis Spectroscopy Data

#Load UV-Vis spectra data
uv_vis<-read_excel("UVVIS.xlsx")
#Check dimensions
dim(uv_vis)
[1] 183 125
#Bind the data to have the same columns as HSI data
uv_vis<-cbind(hsi[,c(1,2)],uv_vis[,-c(1:4)])
colnames(uv_vis[,c(1:3)])#check whether the changes have been effected
[1] "class_1"      "perc_adulter" "200"         
table(uv_vis$class_1)#Check the class distribution

Adulterated   Pure EVOO 
        144          39 
class(uv_vis$class_1)#ensure class_1 is a factor
[1] "factor"
print(paste('The max value is', max(uv_vis[,-c(1:2)]), 'and the min value is', min(uv_vis[,-c(1:2)])))#The data is OK
[1] "The max value is 3.631 and the min value is -0.002"
dim(uv_vis)
[1] 183 123
# Define a min-max normalization function to scale values to the [0, 1] range
min_max_normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
uv_vis<-min_max_normalize(uv_vis[,-c(1:2)])
print(paste('The max value is', max(uv_vis), 'and the min value is', min(uv_vis)))
[1] "The max value is 1 and the min value is 0"
anyNA(uv_vis)
[1] FALSE
dim(uv_vis)
[1] 183 121
#There are 183 observations and 121 covariates

Load and Inspect GC-MS Data

#Load GC-MS data
gc_ms<-read_excel("GC_MS.xlsx")
#Check dimensions
dim(gc_ms)
[1] 258   9
gc_ms$class_1<-factor(gc_ms$class_1)#convert to factor
# Define a min-max normalization function to scale values to the [0, 1] range
min_max_normalize <- function(x) {
  return((x - min(x)) / (max(x) - min(x)))
}
gc<-min_max_normalize(gc_ms[,-c(1:2)])
print(paste('The max value is', max(gc), 'and the min value is', min(gc)))
[1] "The max value is 1 and the min value is 0"
anyNA(gc_ms)
[1] FALSE
gc_ms<-cbind(gc_ms[,c(1,2)],gc)

Unsupervised Learning

Part 1: Exploratory Data Analysis by Principal Component Analysis (PCA)

  • PCA is essential for managing the complexity of high-dimensional spectral data from HSI, FTIR, and Raman Spectroscopy. It enhances data analysis by reducing dimensionality, filtering noise, extracting key features, improving computational efficiency, and addressing collinearity, ultimately leading to more effective and insightful scientific investigations.

Run PCA

hsi_pca<-PCA(hsi[,-c(1,2)],graph = F)#HSI data
hsi_pca$eig[1:5,]#extracting the first 5 components' eigenvalues
       eigenvalue percentage of variance cumulative percentage of variance
comp 1 178.859596             79.8480341                          79.84803
comp 2  24.388007             10.8875031                          90.73554
comp 3  15.636533              6.9805950                          97.71613
comp 4   1.991524              0.8890731                          98.60521
comp 5   1.710047              0.7634137                          99.36862
raman_pca<-PCA(raman[,-c(1,2)],graph = F)#Raman data
raman_pca$eig[1:5,]#extract the first 5 components' eigenvalues
       eigenvalue percentage of variance cumulative percentage of variance
comp 1  747.84326              53.455559                          53.45556
comp 2  238.93343              17.078872                          70.53443
comp 3  149.71068              10.701264                          81.23569
comp 4   48.34892               3.455963                          84.69166
comp 5   36.93486               2.640090                          87.33175
ftir_pca<-PCA(ftir[,-c(1,2)],graph = F)#FTIR data
ftir_pca$eig[1:5,]#extract the first 5 components' eigenvalues
       eigenvalue percentage of variance cumulative percentage of variance
comp 1 780.478296             85.2050542                          85.20505
comp 2 100.129563             10.9311750                          96.13623
comp 3  21.637108              2.3621297                          98.49836
comp 4   9.707437              1.0597639                          99.55812
comp 5   1.680537              0.1834647                          99.74159
uv_vis_pca<-PCA(uv_vis[,-c(1,2)],graph = F)#UV-Vis data
uv_vis_pca$eig[1:5,]#extract the first 5 components' eigenvalues
       eigenvalue percentage of variance cumulative percentage of variance
comp 1  88.646063              74.492490                          74.49249
comp 2  13.948443              11.721381                          86.21387
comp 3   6.639466               5.579384                          91.79325
comp 4   4.772881               4.010824                          95.80408
comp 5   2.219621               1.865228                          97.66931
gc_ms_pca<-PCA(gc_ms[,-c(1,2)],graph = F)#GC-MS data
gc_ms_pca$eig[1:5,]
       eigenvalue percentage of variance cumulative percentage of variance
comp 1  2.7496514              39.280734                          39.28073
comp 2  1.2952854              18.504077                          57.78481
comp 3  1.1811070              16.872957                          74.65777
comp 4  0.8510186              12.157408                          86.81518
comp 5  0.4893181               6.990258                          93.80543

Scree Plots

HSI

s1<-fviz_eig(hsi_pca, addlabels = TRUE, ylim = c(0, 90),xlim=c(1,5), main = 'HSI Scree Plot',ylab = '% Variance', barfill = "chocolate1",hjust = 0.5,ncp = 9,ggtheme = theme_bw(),xlab = "PCs")+
  theme(plot.title = element_text(hjust = 0.5))+
   theme(panel.grid = element_blank())

Raman

s2<-fviz_eig(raman_pca, addlabels = TRUE, ylim = c(0, 60),xlim=c(1,5),main = 'Raman Scree Plot',ylab = '% Variance',barfill = "grey",hjust = 0.5,ncp = 20,ggtheme = theme_bw(),xlab = "PCs")+
  theme(plot.title = element_text(hjust = 0.5))+
   theme(panel.grid = element_blank())

FTIR

s3<-fviz_eig(ftir_pca, addlabels = TRUE, ylim = c(0, 90),xlim=c(1,5),main = 'FTIR Scree Plot',ylab = '% Variance',barfill = "lightblue",hjust = 0.5,ncp = 7,ggtheme = theme_bw(),xlab = "PCs")+
  theme(plot.title = element_text(hjust = 0.5))+
   theme(panel.grid = element_blank())

UV-Vis

s4<-fviz_eig(uv_vis_pca, addlabels = TRUE, ylim = c(0, 90),xlim=c(1,5),main = 'UV-Vis Scree Plot',ylab = '% Variance',barfill = "goldenrod4",hjust = 0.5,ncp = 15,ggtheme = theme_bw(),xlab = "PCs")+
  theme(plot.title = element_text(hjust = 0.5))+
  theme(panel.grid = element_blank())
#Patch together the scree plots
gridExtra::grid.arrange(s1,s2, nrow =1)

gridExtra::grid.arrange(s3,s4,nrow=1)

GC-MS

s5<-fviz_eig(gc_ms_pca, addlabels = TRUE, ylim = c(0, 40),main = 'GC-MS',barfill = "#660",ylab = '% Variance', hjust = 0.5,ncp = 8,ggtheme = theme_bw(),xlab = "PCs")+
  theme(plot.title = element_text(hjust = 0.5))+
   theme(panel.grid = element_blank())
s5

As the eigenvalue tables show, the first 3 PCs explain more than 80% of the variation in each of the spectroscopic data sets (HSI, Raman, FTIR, and UV-Vis); for GC-MS, the first 3 PCs account for about 75% of the variation.

Visualization of PC Scores

Different principal components will be examined to visualize any patterns in the data. The next step is to create a new data frame for each technique for use in subsequent analyses.

#HSI Data
hsi_new<-as.data.frame(hsi_pca$ind$coord) #Extract the PCs
colnames(hsi_new)<-c("PC_1","PC_2", "PC_3","PC_4","PC_5")
hsi_new<-cbind(hsi[,c(1,2)],hsi_new)#Bind with the dependent variables
head(hsi_new)
    class_1 perc_adulter      PC_1      PC_2     PC_3       PC_4        PC_5
1 Pure EVOO            0 -2.187702 -5.702637 5.384698  0.7499500 -1.62406806
2 Pure EVOO            0 -5.348023 -6.098424 5.722411  0.8001908 -1.65044915
3 Pure EVOO            0 -3.704094 -6.939274 4.777033  0.2384026 -0.09020059
4 Pure EVOO            0 -6.071234 -8.480851 2.540647 -1.2018337 -1.19048180
5 Pure EVOO            0 -7.461026 -6.364342 2.719426 -0.8607904 -1.26376980
6 Pure EVOO            0 -7.621317 -7.198176 2.559501 -1.0043913 -1.41078619
#Raman Data
raman_new<-as.data.frame(raman_pca$ind$coord) #Extract the PCs
colnames(raman_new)<-c("PC_1","PC_2", "PC_3","PC_4","PC_5")
raman_new<-cbind(hsi[,c(1,2)],raman_new)#Bind with the dependent variables
head(raman_new)
    class_1 perc_adulter       PC_1       PC_2       PC_3       PC_4       PC_5
1 Pure EVOO            0 -31.855461  -8.273141  2.8751214 -1.7878593 -0.5867908
2 Pure EVOO            0   2.926610 -11.610651  9.2270293 -1.9386635 -3.4599492
3 Pure EVOO            0 -27.807180  -8.646078  2.7069419 -3.5894484 -6.5608871
4 Pure EVOO            0  83.433800 -14.237351 11.9046484  0.1890600 -0.3104792
5 Pure EVOO            0   2.290969 -13.853035 16.0812654  0.8736549 -1.4527735
6 Pure EVOO            0  15.421952  -8.836333 -0.2474466 -4.0856311  0.4005661
#FTIR Data
ftir_new<-as.data.frame(ftir_pca$ind$coord) #Extract the PCs
colnames(ftir_new)<-c("PC_1","PC_2", "PC_3","PC_4","PC_5")
ftir_new<-cbind(hsi[,c(1,2)],ftir_new)#Bind with the dependent variables
head(ftir_new)
    class_1 perc_adulter      PC_1      PC_2       PC_3      PC_4     PC_5
1 Pure EVOO            0 -21.88919 -17.71060 -1.9711847 -4.667742 1.784163
2 Pure EVOO            0 -20.98744 -15.34473 -3.2510690 -4.157185 1.954602
3 Pure EVOO            0 -20.26009 -14.72254 -0.6391069 -3.928866 1.763410
4 Pure EVOO            0 -19.71637 -15.48506 -1.5577696 -3.887572 1.537213
5 Pure EVOO            0 -19.96274 -16.15612 -0.8666744 -4.230593 1.077578
6 Pure EVOO            0 -19.20343 -15.92712  0.8544482 -4.293804 1.555746
#UV-Vis Data
uvvis_new<-as.data.frame(uv_vis_pca$ind$coord) #Extract the PCs
colnames(uvvis_new)<-c("PC_1","PC_2", "PC_3","PC_4","PC_5")
uvvis_new<-cbind(hsi[,c(1,2)],uvvis_new)#Bind with the dependent variables
head(uvvis_new)
    class_1 perc_adulter      PC_1      PC_2       PC_3       PC_4       PC_5
1 Pure EVOO            0  7.296697 -4.291387  4.4288577  1.9719091 -4.0999403
2 Pure EVOO            0  7.255107 -4.324854  4.4859377  1.9859038 -4.0926554
3 Pure EVOO            0  1.734442 -2.857882 -0.1336213  0.1124462 -0.6712346
4 Pure EVOO            0 14.505716 -2.821646  1.6812895 -0.1713008 -1.0105671
5 Pure EVOO            0 12.117024 -2.788984  1.2772269 -0.2578499 -0.8671703
6 Pure EVOO            0 14.114841 -2.455133  1.3193080 -0.1589796 -0.7689586
#GC-MS Data
gc_new<-as.data.frame(gc_ms_pca$ind$coord) #Extract the PCs
colnames(gc_new)<-c("PC_1","PC_2", "PC_3","PC_4","PC_5")
gc_new<-cbind(gc_ms[,c(1,2)],gc_new)#Bind with the dependent variables

PC Plots

#HSI PC Plot
p1 <- hsi_new %>% 
  ggplot(mapping = aes(x = PC_1, y = PC_2, shape = class_1, color = perc_adulter)) +
  geom_point() + 
  labs(x = "PC1 (79.8%)", y = "PC2 (10.9%)",title = "HSI PC Plot", shape = "Oil type", color = "Percent Adulteration") +
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    panel.grid = element_blank(),
    axis.text.x = element_text(color = 'black', size = 10),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
     legend.position = "none") +
  scale_color_gradient(low = "#000000", high = "red") +
  stat_ellipse(aes(group = class_1), 
               level = 0.95, 
               geom = "polygon", alpha = 0.2,
               color = 'black', linewidth = 0.6)
#Raman Plot
p2 <- raman_new %>% 
  ggplot(mapping = aes(x = PC_2, y = PC_3, shape = class_1, color = perc_adulter)) +
  geom_point() + 
  labs(x = "PC2 (17.1%)", y = "PC3 (10.7%)",title = "Raman PC Plot", shape = "Oil type", color = "Percent Adulteration") +
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    axis.text.x = element_text(color = 'black', size = 10),
    panel.grid = element_blank(),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 7),
    legend.text = element_text(size = 6)) +
  scale_color_gradient(low = "#000000", high = "red") +
  stat_ellipse(aes(group = class_1), 
               level = 0.95, 
               geom = "polygon", alpha = 0.2,
               color = 'black', linewidth = 0.6)
#FTIR Plot
p3 <- ftir_new %>% 
  ggplot(mapping = aes(x = PC_2, y = PC_3, shape = class_1, color = perc_adulter)) +
  geom_point() + 
  labs(x = "PC2 (10.9%)", y = "PC3 (2.4%)",title = "FTIR PC Plot", shape = "Oil type", color = "Percent Adulteration") +
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    axis.text.x = element_text(color = 'black', size = 10),
    panel.grid = element_blank(),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 7),
    legend.text = element_text(size = 6),
     legend.position = "none") +
  scale_color_gradient(low = "#000000", high = "red") +
  stat_ellipse(aes(group = class_1), 
               level = 0.95, 
               geom = "polygon", alpha = 0.2,
               color = 'black', linewidth = 0.6)
#UV-Vis Plot
p4 <- uvvis_new%>% 
  ggplot(mapping = aes(x = PC_1, y = PC_2, shape = class_1, color = perc_adulter)) +
  geom_point() + 
  labs(x = "PC1 (74.5%)", y = "PC2 (11.7%)",title = "UV-Vis PC Plot", shape = "Oil type", color = "Percent Adulteration") +
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    panel.grid = element_blank(),
    axis.text.x = element_text(color = 'black', size = 10),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 7),
    legend.text = element_text(size = 6)) +
  scale_color_gradient(low = "#000000", high = "red") +
  stat_ellipse(aes(group = class_1), 
               level = 0.95, 
               geom = "polygon", alpha = 0.2,
               color = 'black', linewidth = 0.6)

Patch the PC Plots together

suppressWarnings(suppressMessages(library(gridExtra)))
grid.arrange(p1,p2,p3,p4, nrow = 2)

#GC-MS Plot
p5 <- gc_new %>% 
  ggplot(mapping = aes(x = PC_1, y = PC_2, shape = class_1, color = perc_adulter)) +
  geom_point() + 
  labs(x = "PC1 (39.3%)", y = "PC2 (18.5%)",title = "GC-MS PC Plot", shape = "Oil type", color = "Percent Adulteration") +
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    panel.grid = element_blank(),
    axis.text.x = element_text(color = 'black', size = 10),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 7),
    legend.text = element_text(size = 6)) +
  scale_color_gradient(low = "#000000", high = "red") +
  stat_ellipse(aes(group = class_1), 
               level = 0.95, 
               geom = "polygon", alpha = 0.2,
               color = 'black', linewidth = 0.6)
#Display plot
p5

PCA results indicate interesting patterns. Although the separation does not appear to be very clear, authentic olive oil tends to separate from adulterated olive oils, especially with HSI, UV-Vis, and GC-MS. PCA, however, demonstrates weakness in discerning oils adulterated at different levels, hence the need for additional supervised algorithms.

PC variable contributions

This section investigates the contribution of different variables to the variation in principal components (PCs). By analyzing the loadings of each variable on the principal components, we can determine which variables have the most significant impact on the observed patterns in the data. This analysis helps to identify key features that drive the separation of samples in the PCA plot.

#HSI
h1<-fviz_contrib(hsi_pca, choice = "var", top = 5,axes = 1, sort.val = 'desc', fill = "olivedrab")
h2<-fviz_contrib(hsi_pca, choice = "var", top = 5,axes = 2, sort.val = 'desc', fill = "cadetblue")
grid.arrange(h1,h2, nrow = 1)

#Raman
r1<-fviz_contrib(raman_pca, choice = "var", top = 5,axes = 1, sort.val = 'desc', fill = "#E7B800")
r2<-fviz_contrib(raman_pca, choice = "var", top = 5,axes = 2, sort.val = 'desc', fill = "#00AFBB")
grid.arrange(r1,r2, nrow = 1)

#FTIR
f1<-fviz_contrib(ftir_pca, choice = "var", top = 5,axes = 1, sort.val = 'desc', fill = "tan4")
f2<-fviz_contrib(ftir_pca, choice = "var", top = 5,axes = 2, sort.val = 'desc', fill = "cornflowerblue")
grid.arrange(f1,f2, nrow = 1)

#UV-Vis
uv1<-fviz_contrib(uv_vis_pca, choice = "var", top = 5,axes = 1, sort.val = 'desc', fill = "thistle")
uv2<-fviz_contrib(uv_vis_pca, choice = "var", top = 5,axes = 2, sort.val = 'desc', fill = "powderblue")
grid.arrange(uv1,uv2, nrow = 1)

#GC-MS
gc1<-fviz_contrib(gc_ms_pca, choice = "var", top = 5,axes = 1, sort.val = 'desc', fill = "chocolate")
gc2<-fviz_contrib(gc_ms_pca, choice = "var", top = 5,axes = 2, sort.val = 'desc', fill = "gray58")
grid.arrange(gc1,gc2, nrow = 1)

Part 2: K-Means Clustering

#Let us find the optimal number of clusters based on the silhouette method


# optimal number of clusters for HSI
hsi_clust<-fviz_nbclust(hsi[,-c(1:2)], kmeans, method = "silhouette", k.max=10)
print(hsi_clust)

# optimal number of clusters for Raman
raman_clust<-fviz_nbclust(raman, kmeans, method = "silhouette", k.max=10)
print(raman_clust)

#optimal number of clusters for FTIR
ftir_clust<-fviz_nbclust(ftir[,-c(1,2)], kmeans, method = "silhouette", k.max=10)
print(ftir_clust)

#optimal number of clusters for UV-Vis
uvvis_clust<-fviz_nbclust(uv_vis, kmeans, method = "silhouette", k.max=10)
plot(uvvis_clust)

#optimal number of clusters for GC-MS
gc_clust<-fviz_nbclust(gc, kmeans, method = "silhouette", k.max=10)
print(gc_clust)

  • The silhouette method suggests the existence of either 2 or 3 clusters in the data sets.

  • We will then perform k-means clustering with the optimal number of clusters
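As a quick illustration of what the silhouette criterion optimises, the average silhouette width for a given k can be computed directly with `cluster::silhouette()`. The small synthetic two-cluster matrix below is illustrative only and is not part of the oil data:

```r
# Hedged sketch: average silhouette width for k = 2 on a synthetic
# two-cluster matrix; fviz_nbclust(method = "silhouette") picks the k
# that maximizes this quantity across candidate values of k.
library(cluster)

set.seed(123)
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),   # cluster around (0, 0)
           matrix(rnorm(40, mean = 4), ncol = 2))   # cluster around (4, 4)
km  <- kmeans(x, centers = 2)
sil <- silhouette(km$cluster, dist(x))
mean(sil[, "sil_width"])  # values near 1 indicate well-separated clusters
```

Silhouette width combines within-cluster cohesion and between-cluster separation per point, which is why a single averaged number can rank candidate values of k.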

# HSI k-means analysis and plots

hsi_kmeans <- kmeans(hsi[,-c(1,2)],2)
cluster<-  hsi_kmeans$cluster
hsi_k_data <-cbind(hsi_new,cluster)
hsi_k_data$cluster<-as.factor(hsi_k_data$cluster)

hsi_k_data %>% 
  ggplot(mapping = aes(x = PC_1, y = PC_2, color = cluster, shape = class_1)) +
  geom_point() + 
  labs(x = "PC1", y = "PC2",title = "HSI k-means cluster Plot")+
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    panel.grid = element_blank(),
    axis.text.x = element_text(color = 'black', size = 10),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.position = "right")+
  scale_color_manual(values = c("blue", "red"))

# Raman k-means analysis and plotting 

raman_kmeans <- kmeans(raman,3)
cluster<-  raman_kmeans$cluster
raman_k_data <-cbind(raman_new,cluster)
raman_k_data$cluster<-as.factor(raman_k_data$cluster)

raman_k_data %>% 
  ggplot(mapping = aes(x = PC_1, y = PC_2, color = cluster, shape = class_1)) +
  geom_point() + 
  labs(x = "PC1", y = "PC2",title = "Raman k-means cluster Plot")+
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    panel.grid = element_blank(),
    axis.text.x = element_text(color = 'black', size = 10),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.position = "right")+
  scale_color_manual(values = c("blue", "red","black"))

# FTIR k-means analysis and plotting

ftir_kmeans <- kmeans(ftir[,-c(1,2)],3)
cluster<-  ftir_kmeans$cluster
ftir_k_data <-cbind(ftir_new,cluster)
ftir_k_data$cluster<-as.factor(ftir_k_data$cluster)

ftir_k_data %>% 
  ggplot(mapping = aes(x = PC_1, y = PC_2, color = cluster, shape = class_1)) +
  geom_point() + 
  labs(x = "PC1", y = "PC2",title = "FTIR k-means cluster Plot")+
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    panel.grid = element_blank(),
    axis.text.x = element_text(color = 'black', size = 10),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.position = "right")+
  scale_color_manual(values = c("blue", "red","black"))

# UV-Vis k-means analysis and plotting

uvvis_kmeans <- kmeans(uv_vis,2)
cluster<-  uvvis_kmeans$cluster
uvvis_k_data <-cbind(uvvis_new,cluster)
uvvis_k_data$cluster<-as.factor(uvvis_k_data$cluster)

uvvis_k_data %>% 
  ggplot(mapping = aes(x = PC_1, y = PC_2, color = cluster, shape = class_1)) +
  geom_point() + 
  labs(x = "PC1", y = "PC2",title = "UV-Vis k-means cluster Plot")+
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    panel.grid = element_blank(),
    axis.text.x = element_text(color = 'black', size = 10),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.position = "right")+
  scale_color_manual(values = c("blue", "red"))

# GC-MS k-means analysis and plotting

gc_kmeans <- kmeans(gc,2)
cluster<-  gc_kmeans$cluster
gc_k_data <-cbind(gc_new,cluster)
gc_k_data$cluster<-as.factor(gc_k_data$cluster)

gc_k_data %>% 
  ggplot(mapping = aes(x = PC_1, y = PC_2, color = cluster, shape = class_1)) +
  geom_point() + 
  labs(x = "PC1", y = "PC2",title = "GC-MS k-means cluster Plot")+
  theme_bw() +
  theme(
    panel.border = element_rect(color = 'black', fill = NA),
    panel.grid = element_blank(),
    axis.text.x = element_text(color = 'black', size = 10),
    axis.text.y = element_text(color = 'black', size = 10), 
    aspect.ratio = 1,
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.title = element_text(size = 9, hjust = 0.5),
    legend.title = element_text(size = 8),
    legend.text = element_text(size = 6),
    legend.position = "right")+
  scale_color_manual(values = c("blue", "red"))

  • Although PCA and k-means clustering show partial group separation, some overlapping clusters persist. This highlights the need for supervised learning techniques.

Part 3: Supervised Classification

This section will build, train, and test machine learning models to detect fraud in extra-virgin olive oil. Several ML models, such as LDA, KNN, Random Forest, Decision Trees, Support Vector Machines, and Artificial Neural Networks, will be explored.

  • The models are evaluated using accuracy and the Matthews correlation coefficient (MCC). Accuracy measures the proportion of correct predictions (true positives and true negatives) among all predictions, giving an overall sense of model performance. However, accuracy can be less informative on imbalanced data sets.

  • The MCC provides a more balanced evaluation, particularly in situations with class imbalance. It considers all four categories of predictions: true positives, true negatives, false positives, and false negatives. MCC produces a value between -1 and +1, where +1 indicates perfect predictions, -1 indicates complete disagreement, and 0 suggests random predictions. This makes MCC a more reliable metric for evaluating the model’s performance, especially in skewed data scenarios.
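To make the MCC definition concrete, it can be computed directly from the four confusion-matrix counts. The helper below is an illustrative sketch (several R packages provide equivalents), and the example counts are hypothetical, not results from the models in this document:

```r
# Hedged sketch: Matthews correlation coefficient from confusion-matrix
# counts (tp, tn, fp, fn). Returns a value in [-1, 1]; by convention it
# is 0 when any marginal total is zero.
mcc <- function(tp, tn, fp, fn) {
  num <- tp * tn - fp * fn
  den <- sqrt(tp + fp) * sqrt(tp + fn) * sqrt(tn + fp) * sqrt(tn + fn)
  if (den == 0) return(0)
  num / den
}

# Hypothetical test-set result: 43 adulterated and 11 pure samples,
# with 2 adulterated samples missed (predicted pure)
mcc(tp = 41, tn = 11, fp = 0, fn = 2)  # ~0.90
```

Because every cell of the confusion matrix enters the denominator, the 43/11 class imbalance in our test sets cannot inflate MCC the way it can inflate accuracy.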

Split the data into training and test sets (cross-validation will be used during training) and customize the class names/labels

#HSI data
#we will use data reduced by PCA

#set seed for reproducibility
set.seed (123)

hsi_cf<-hsi_new[,-2]
train_index_hsi<-createDataPartition(hsi_cf$class_1,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_hsi_cf<-hsi_cf[train_index_hsi,]
data_test_hsi_cf<-hsi_cf[-train_index_hsi,]

#Check the dimensions of the train and test set for HSI data

dim(data_train_hsi_cf)
[1] 129   6
dim(data_test_hsi_cf)
[1] 54  6
#Change the labels to names

#Train data
levels(data_train_hsi_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_train_hsi_cf$class_1)<-make.names(levels(data_train_hsi_cf$class_1))
#Test data
levels(data_test_hsi_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_test_hsi_cf$class_1)<-make.names(levels(data_test_hsi_cf$class_1))

#Confirm the changes to the names have been made and the proportion is OK
table(data_train_hsi_cf$class_1)

Adulterated   Pure_EVOO 
        101          28 
table(data_test_hsi_cf$class_1)

Adulterated   Pure_EVOO 
         43          11 
#Raman data

#set seed for reproducibility
set.seed (123)

raman_cf<-raman_new[,-2]
train_index_raman<-createDataPartition(raman_cf$class_1,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_raman_cf<-raman_cf[train_index_raman,]
data_test_raman_cf<-raman_cf[-train_index_raman,]

#Check the dimensions of the train and test set for raman data

dim(data_train_raman_cf)
[1] 129   6
dim(data_test_raman_cf)
[1] 54  6
#Change the labels to names

#Train data
levels(data_train_raman_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_train_raman_cf$class_1)<-make.names(levels(data_train_raman_cf$class_1))
#Test data
levels(data_test_raman_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_test_raman_cf$class_1)<-make.names(levels(data_test_raman_cf$class_1))

#Confirm the changes to the names have been made and the proportion is OK
table(data_train_raman_cf$class_1)

Adulterated   Pure_EVOO 
        101          28 
table(data_test_raman_cf$class_1)

Adulterated   Pure_EVOO 
         43          11 
#FTIR data
#set seed for reproducibility
set.seed(123)

ftir_cf<-ftir_new[,-2]
train_index_ftir<-createDataPartition(ftir_cf$class_1,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_ftir_cf<-ftir_cf[train_index_ftir,]
data_test_ftir_cf<-ftir_cf[-train_index_ftir,]

#Check the dimensions of the train and test set for ftir data

dim(data_train_ftir_cf)
[1] 129   6
dim(data_test_ftir_cf)
[1] 54  6
#Change the labels to names

#Train data
levels(data_train_ftir_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_train_ftir_cf$class_1)<-make.names(levels(data_train_ftir_cf$class_1))
#Test data
levels(data_test_ftir_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_test_ftir_cf$class_1)<-make.names(levels(data_test_ftir_cf$class_1))

#Confirm the changes to the names have been made and the proportion is OK
table(data_train_ftir_cf$class_1)

Adulterated   Pure_EVOO 
        101          28 
table(data_test_ftir_cf$class_1)

Adulterated   Pure_EVOO 
         43          11 
#UV-Vis data

#set seed for reproducibility
set.seed(123)

uvvis_cf<-uvvis_new[,-2]
train_index_uvvis<-createDataPartition(uvvis_cf$class_1,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_uvvis_cf<-uvvis_cf[train_index_uvvis,]
data_test_uvvis_cf<-uvvis_cf[-train_index_uvvis,]

#Check the dimensions of the train and test set for uvvis data

dim(data_train_uvvis_cf)
[1] 129   6
dim(data_test_uvvis_cf)
[1] 54  6
#Change the labels to names

#Train data
levels(data_train_uvvis_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_train_uvvis_cf$class_1)<-make.names(levels(data_train_uvvis_cf$class_1))
#Test data
levels(data_test_uvvis_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_test_uvvis_cf$class_1)<-make.names(levels(data_test_uvvis_cf$class_1))

#Confirm the changes to the names have been made and the proportion is OK
table(data_train_uvvis_cf$class_1)

Adulterated   Pure_EVOO 
        101          28 
table(data_test_uvvis_cf$class_1)

Adulterated   Pure_EVOO 
         43          11 
#GC-MS data
#set seed for reproducibility
set.seed(123)

gc_cf<-gc_new[,-2]
train_index_gc<-createDataPartition(gc_cf$class_1,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_gc_cf<-gc_cf[train_index_gc,]
data_test_gc_cf<-gc_cf[-train_index_gc,]

#Check the dimensions of the train and test set for gc data

dim(data_train_gc_cf)
[1] 182   6
dim(data_test_gc_cf)
[1] 76  6
#Change the labels to names

#Train data
levels(data_train_gc_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_train_gc_cf$class_1)<-make.names(levels(data_train_gc_cf$class_1))
#Test data
levels(data_test_gc_cf$class_1)<-c("Adulterated", "Pure_EVOO")
levels(data_test_gc_cf$class_1)<-make.names(levels(data_test_gc_cf$class_1))

#Confirm the changes to the names have been made and the proportion is OK
table(data_train_gc_cf$class_1)

Adulterated   Pure_EVOO 
        152          30 
table(data_test_gc_cf$class_1)

Adulterated   Pure_EVOO 
         64          12 
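The five splitting blocks above are near-identical, so the logic could be factored into a helper. Below is a minimal base-R sketch of a stratified 70/30 split; the function name `stratified_split` is illustrative, and `createDataPartition` from caret (used above) handles within-class rounding slightly differently, so the exact counts can differ by a sample or two.

```r
# Stratified 70/30 split: sample within each class so the training and
# test sets preserve the overall class proportions.
stratified_split <- function(labels, p = 0.7, seed = 123) {
  set.seed(seed)
  idx <- unlist(lapply(split(seq_along(labels), labels), function(i) {
    sample(i, size = floor(length(i) * p))
  }), use.names = FALSE)
  sort(idx)
}

# Hypothetical class vector with the same imbalance as the spectral data sets
classes <- factor(rep(c("Adulterated", "Pure_EVOO"), times = c(144, 39)))
train_idx <- stratified_split(classes)
table(classes[train_idx])   # class proportions preserved in the training set
```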

Set up the parameters for training the models using stratified 10-fold cross-validation repeated 10 times.

  • The optimal model will be selected using the one-standard-error rule (oneSE): the simplest model whose performance is within one standard error of the best-performing model is chosen, to reduce the risk of overfitting.

  • Additionally, SMOTE (Synthetic Minority Over-sampling Technique) will be applied during resampling to compensate for the class imbalance between adulterated and pure samples.
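The oneSE rule can be illustrated with a short sketch using hypothetical accuracy values. For k-NN, a larger k gives a smoother and hence simpler model; caret's actual `oneSE` selection function differs in minor implementation details.

```r
# one-SE rule for k-NN: among all k whose CV accuracy is within one
# standard error of the best accuracy, pick the largest (simplest) k.
one_se_k <- function(k, accuracy, accuracy_sd, n_resamples = 100) {
  best <- which.max(accuracy)
  threshold <- accuracy[best] - accuracy_sd[best] / sqrt(n_resamples)
  max(k[accuracy >= threshold])
}

k      <- c(3, 5, 7, 9, 11)        # hypothetical tuning grid
acc    <- c(1, 1, 1, 1, 0.999)     # hypothetical CV accuracies
acc_sd <- c(0, 0, 0, 0, 0.007)     # hypothetical accuracy SDs
one_se_k(k, acc, acc_sd)           # largest k within one SE of the best
```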

# Set up the training control (10-fold CV repeated 10 times)
control <- trainControl(method = "repeatedcv", number = 10, repeats = 10,
                        classProbs = TRUE, savePredictions = "final",
                        summaryFunction = multiClassSummary,
                        selectionFunction = 'oneSE', sampling = 'smote',
                        allowParallel = TRUE)
#Set the metric as accuracy 
metric<-"Accuracy"
# Detect the number of cores
num_cores <- detectCores()
# Print the number of cores
print(paste("Number of cores available:", num_cores))
[1] "Number of cores available: 8"
#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

Set up the hyperparameters and train the different models

Model 1. k-Nearest Neighbors (k-NN)

  • k-Nearest Neighbors (k-NN) is a simple and widely used supervised machine-learning algorithm for classification and regression. It identifies the k closest training points (neighbors) to a given input according to a distance metric (commonly Euclidean distance) and assigns the majority class among those neighbors (for classification) or their average value (for regression).
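As a toy illustration of the distance-and-voting step (not the caret implementation used below; all data points here are invented for the example):

```r
# Toy k-NN classifier: Euclidean distance + majority vote among k neighbours.
knn_predict <- function(train_x, train_y, new_x, k = 3) {
  d <- sqrt(colSums((t(train_x) - new_x)^2))   # distance to every training row
  votes <- train_y[order(d)[1:k]]              # labels of the k nearest points
  names(sort(table(votes), decreasing = TRUE))[1]
}

train_x <- rbind(c(0, 0), c(0, 1), c(5, 5), c(5, 6))
train_y <- c("Pure_EVOO", "Pure_EVOO", "Adulterated", "Adulterated")
knn_predict(train_x, train_y, new_x = c(4.5, 5.2), k = 3)   # -> "Adulterated"
```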
#start_time
start_time<-Sys.time()

#Set up grid for k neighbors
grid_knn<-expand.grid(.k = seq(3,30, by = 2))#Keep k odd to avoid ties in the majority vote

#Train k-NN models for all the techniques

#HSI k-NN model
fit_knn_hsi<-train(y=data_train_hsi_cf[,1],x=data_train_hsi_cf[,-1],
                   method = "knn",tuneGrid = grid_knn,trControl = control,metric = metric)

#Raman k-NN model
fit_knn_raman<-train(y=data_train_raman_cf[,1],x=data_train_raman_cf[,-1],
                   method = "knn",tuneGrid = grid_knn,trControl = control,metric = metric)

#FTIR k-NN model
fit_knn_ftir<-train(y=data_train_ftir_cf[,1],x=data_train_ftir_cf[,-1],
                   method = "knn",tuneGrid = grid_knn,trControl = control,metric = metric)

#UV-Vis k-NN model
fit_knn_uvvis<-train(y=data_train_uvvis_cf[,1],x=data_train_uvvis_cf[,-1],
                   method = "knn",tuneGrid = grid_knn,trControl = control,metric = metric)
#GC-MS k-NN model
fit_knn_gc<-train(y=data_train_gc_cf[,1],x=data_train_gc_cf[,-1],
                   method = "knn",tuneGrid = grid_knn,trControl = control,metric = metric)
#End_time

end_time<-Sys.time()
model_training_time<-end_time-start_time
stopCluster(cl)#stop the parallel run cluster
print(paste('The time taken to run the models is',round(model_training_time)))
[1] "The time taken to run the models is 55"
Plot the cross-validation results for the k-NN models
#HSI CV Plot
p1<-ggplot(fit_knn_hsi)+geom_line(colour = "red")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='HSI k-NN Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

#Raman CV Plot
p2<-ggplot(fit_knn_raman)+geom_line(colour = "blue")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='Raman k-NN Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

#FTIR CV Plot
p3<-ggplot(fit_knn_ftir)+geom_line(colour = "black")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='FTIR k-NN Model Training', y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

#UV-Vis CV Plot
p4<-ggplot(fit_knn_uvvis)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='UV-Vis k-NN Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

#Arrange the knn training model plots

grid.arrange(p1,p2,nrow = 1)

grid.arrange(p3,p4,nrow = 1)

#GC-MS CV Plot
ggplot(fit_knn_gc)+geom_line(colour = "red",lty = 'dashed')+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='GC-MS k-NN Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

#Display the cross-validation results

#HSI CV results
print(paste('The optimal number of k for training the HSI-kNN model is',fit_knn_hsi$bestTune))
[1] "The optimal number of k for training the HSI-kNN model is 9"
#Output table
knitr::kable(fit_knn_hsi$results)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
3 0.0000000 1 0.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1 1 1.0000000 1 1.0000000 0.7836996 1.0000000 0.0000000 0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0 0 0.0000000 0 0.0000000 0.0254187 0.0000000
5 0.0000000 1 0.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1 1 1.0000000 1 1.0000000 0.7836996 1.0000000 0.0000000 0 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0 0 0.0000000 0 0.0000000 0.0254187 0.0000000
7 0.0012336 1 0.0039091 1.0000000 1.0000000 1.0000000 1.0000000 1 1 1.0000000 1 1.0000000 0.7836996 1.0000000 0.0045619 0 0.0133349 0.0000000 0.0000000 0.0000000 0.0000000 0 0 0.0000000 0 0.0000000 0.0254187 0.0000000
9 0.0071862 1 0.0213636 1.0000000 1.0000000 1.0000000 1.0000000 1 1 1.0000000 1 1.0000000 0.7836996 1.0000000 0.0126692 0 0.0317328 0.0000000 0.0000000 0.0000000 0.0000000 0 0 0.0000000 0 0.0000000 0.0254187 0.0000000
11 0.0193748 1 0.0583636 0.9992857 0.9981081 0.9995238 0.9990909 1 1 0.9975000 1 0.9990909 0.7829853 0.9995455 0.0214171 0 0.0526679 0.0071429 0.0189189 0.0047619 0.0090909 0 0 0.0250000 0 0.0090909 0.0263481 0.0045455
13 0.0358968 1 0.0901818 0.9976832 0.9936678 0.9984712 0.9970909 1 1 0.9916667 1 0.9970909 0.7813828 0.9985455 0.0304160 0 0.0613512 0.0132676 0.0365052 0.0087467 0.0166419 0 0 0.0481125 0 0.0166419 0.0275347 0.0083209
15 0.0520562 1 0.1034545 0.9953114 0.9872872 0.9968922 0.9940909 1 1 0.9833333 1 0.9940909 0.7790110 0.9970455 0.0400521 0 0.0672157 0.0186797 0.0510074 0.0123714 0.0235215 0 0 0.0670025 0 0.0235215 0.0298743 0.0117607
17 0.0695529 1 0.1202727 0.9907418 0.9759188 0.9937285 0.9882727 1 1 0.9695000 1 0.9882727 0.7744414 0.9941364 0.0502078 0 0.0715860 0.0293205 0.0749368 0.0201185 0.0372772 0 0 0.0938949 0 0.0372772 0.0371660 0.0186386
19 0.0866524 1 0.1351818 0.9820879 0.9536107 0.9878805 0.9772727 1 1 0.9411667 1 0.9772727 0.7657875 0.9886364 0.0590888 0 0.0735395 0.0380929 0.0977569 0.0259355 0.0481653 0 0 0.1223015 0 0.0481653 0.0436107 0.0240826
21 0.1035110 1 0.1450000 0.9604579 0.9025463 0.9725137 0.9495455 1 1 0.8808333 1 0.9495455 0.7441575 0.9747727 0.0664672 0 0.0728749 0.0590417 0.1391052 0.0421831 0.0750810 0 0 0.1633916 0 0.0750810 0.0637649 0.0375405
23 0.1190471 1 0.1485000 0.9419689 0.8614646 0.9589843 0.9258182 1 1 0.8361667 1 0.9258182 0.7256685 0.9629091 0.0731269 0 0.0707492 0.0724456 0.1657948 0.0525605 0.0924815 0 0 0.1883691 0 0.0924815 0.0772899 0.0462408
25 0.1338502 1 0.1495000 0.9147619 0.7999894 0.9390398 0.8910909 1 1 0.7672857 1 0.8910909 0.6984615 0.9455455 0.0798152 0 0.0721768 0.0821280 0.1797836 0.0613849 0.1050362 0 0 0.1964998 0 0.1050362 0.0862489 0.0525181
27 0.1478754 1 0.1530000 0.8844414 0.7353215 0.9156403 0.8520909 1 1 0.6969762 1 0.8520909 0.6681410 0.9260455 0.0861722 0 0.0726389 0.0900817 0.1784939 0.0731246 0.1158011 0 0 0.1812059 0 0.1158011 0.0956831 0.0579005
29 0.1605488 1 0.1530000 0.8726282 0.7131603 0.9061594 0.8372727 1 1 0.6765595 1 0.8372727 0.6563278 0.9186364 0.0918769 0 0.0726389 0.0968875 0.1874523 0.0793904 0.1239060 0 0 0.1860402 0 0.1239060 0.1003890 0.0619530
#The optimal selected model
selected_model<-fit_knn_hsi$results %>% filter(k==as.numeric(fit_knn_hsi$bestTune))
knitr::kable(selected_model)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
9 0.0071862 1 0.0213636 1 1 1 1 1 1 1 1 1 0.7836996 1 0.0126692 0 0.0317328 0 0 0 0 0 0 0 0 0 0.0254187 0
#Raman CV results
print(paste('The optimal number of k for training the Raman-kNN model is',fit_knn_raman$bestTune))
[1] "The optimal number of k for training the Raman-kNN model is 3"
#Output table
knitr::kable(fit_knn_raman$results)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
3 1.2440523 0.9226136 0.1331571 0.9125549 0.7481601 0.9426127 0.9354545 0.8266667 0.9541389 0.8186667 0.9541389 0.9354545 0.7333059 0.8810606 1.4700616 0.0949730 0.0731223 0.0741988 0.2058187 0.0505515 0.0755258 0.2036700 0.0533373 0.1948881 0.0533373 0.0755258 0.0670801 0.1095792
5 0.5199716 0.9514015 0.1862446 0.8701374 0.6602270 0.9108905 0.8773636 0.8400000 0.9553611 0.6979048 0.9553611 0.8773636 0.6878663 0.8586818 0.9031939 0.0783163 0.0954922 0.0894164 0.2184521 0.0654113 0.1027747 0.1980828 0.0545850 0.2055274 0.0545850 0.1027747 0.0864626 0.1101974
7 0.3413938 0.9568258 0.2578748 0.8768132 0.6695588 0.9167968 0.8850909 0.8433333 0.9569848 0.6985000 0.9569848 0.8850909 0.6937729 0.8642121 0.6023510 0.0703798 0.1035314 0.0824153 0.2194412 0.0576156 0.0869431 0.2089738 0.0567629 0.1931791 0.0567629 0.0869431 0.0738878 0.1146752
9 0.2317265 0.9620000 0.3160886 0.8685348 0.6522448 0.9106551 0.8741818 0.8450000 0.9566800 0.6790000 0.9566800 0.8741818 0.6853755 0.8595909 0.1085832 0.0551759 0.1118035 0.0777503 0.1980899 0.0557483 0.0856974 0.2013632 0.0557197 0.1767429 0.0557197 0.0856974 0.0745624 0.1071190
11 0.2533985 0.9628561 0.4000652 0.8676374 0.6527440 0.9093048 0.8731818 0.8450000 0.9568416 0.6810357 0.9568416 0.8731818 0.6844780 0.8590909 0.1058960 0.0543594 0.1260493 0.0867627 0.2163872 0.0643642 0.0982302 0.2068618 0.0567610 0.1914122 0.0567610 0.0982302 0.0821423 0.1130083
13 0.2728867 0.9580455 0.4564920 0.8643864 0.6451527 0.9074641 0.8682727 0.8466667 0.9568737 0.6657381 0.9568737 0.8682727 0.6805769 0.8574697 0.0991097 0.0553698 0.1235958 0.0849880 0.2184281 0.0607094 0.0902408 0.2033391 0.0563495 0.1875985 0.0563495 0.0902408 0.0761114 0.1138807
15 0.2831003 0.9566591 0.5081322 0.8678297 0.6570056 0.9094989 0.8713636 0.8533333 0.9582778 0.6786667 0.9582778 0.8713636 0.6830678 0.8623485 0.0967572 0.0588370 0.1246043 0.0867351 0.2206941 0.0617413 0.0947083 0.1985921 0.0554715 0.1931643 0.0554715 0.0947083 0.0802443 0.1114883
17 0.2958991 0.9484545 0.5503303 0.8707967 0.6596115 0.9119975 0.8780909 0.8416667 0.9550108 0.6911190 0.9550108 0.8780909 0.6884615 0.8598788 0.0988328 0.0679925 0.1272058 0.0893104 0.2257080 0.0639194 0.0938825 0.2070245 0.0580709 0.1999427 0.0580709 0.0938825 0.0804896 0.1174677
19 0.3036249 0.9500227 0.5796020 0.8676374 0.6578032 0.9088308 0.8721818 0.8483333 0.9561984 0.6875833 0.9561984 0.8721818 0.6837637 0.8602576 0.0895371 0.0658273 0.1242053 0.0946691 0.2258989 0.0701760 0.1034655 0.1954806 0.0555802 0.2033098 0.0555802 0.1034655 0.0866840 0.1149814
21 0.3149000 0.9484167 0.5978466 0.8604670 0.6359058 0.9045748 0.8663636 0.8366667 0.9535974 0.6707857 0.9535974 0.8663636 0.6789652 0.8515152 0.0865775 0.0644653 0.1207041 0.0878790 0.2136148 0.0639489 0.0971202 0.2037527 0.0572428 0.1985425 0.0572428 0.0971202 0.0799494 0.1129052
23 0.3256583 0.9447273 0.6282154 0.8680952 0.6505883 0.9106228 0.8791818 0.8250000 0.9507641 0.6961667 0.9507641 0.8791818 0.6889103 0.8520909 0.0824829 0.0696239 0.1091959 0.0909702 0.2287119 0.0644129 0.0966014 0.2056647 0.0575348 0.2185281 0.0575348 0.0966014 0.0789141 0.1176983
25 0.3341599 0.9452121 0.6431570 0.8697436 0.6533543 0.9120767 0.8812727 0.8266667 0.9509628 0.6920000 0.9509628 0.8812727 0.6904945 0.8539697 0.0811680 0.0695675 0.1083981 0.0910362 0.2347824 0.0634202 0.0925907 0.2091080 0.0585728 0.2120061 0.0585728 0.0925907 0.0756768 0.1205276
27 0.3453771 0.9485152 0.6650006 0.8587271 0.6354101 0.9029806 0.8653636 0.8333333 0.9522482 0.6757857 0.9522482 0.8653636 0.6779945 0.8493485 0.0791060 0.0652350 0.1072646 0.0959656 0.2331332 0.0692066 0.1063844 0.2037802 0.0576557 0.2207473 0.0576557 0.1063844 0.0852908 0.1175720
29 0.3601544 0.9450682 0.6645529 0.8541026 0.6185880 0.9002130 0.8633636 0.8183333 0.9487312 0.6627857 0.9487312 0.8633636 0.6764560 0.8408485 0.0757367 0.0699862 0.0967849 0.0910510 0.2228996 0.0662816 0.1027988 0.2106787 0.0587298 0.2097488 0.0587298 0.1027988 0.0826288 0.1160534
#The optimal selected model
selected_model<-fit_knn_raman$results %>% filter(k==as.numeric(fit_knn_raman$bestTune))
knitr::kable(selected_model)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
3 1.244052 0.9226136 0.1331571 0.9125549 0.7481601 0.9426127 0.9354545 0.8266667 0.9541389 0.8186667 0.9541389 0.9354545 0.7333059 0.8810606 1.470062 0.094973 0.0731223 0.0741988 0.2058187 0.0505515 0.0755258 0.20367 0.0533373 0.1948881 0.0533373 0.0755258 0.0670801 0.1095792
#FTIR CV results
print(paste('The optimal number of k for training the FTIR-kNN model is',fit_knn_ftir$bestTune))
[1] "The optimal number of k for training the FTIR-kNN model is 5"
#Output table
knitr::kable(fit_knn_ftir$results)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
3 1.0323693 0.9790682 0.0891260 0.9629579 0.8991604 0.9752703 0.9606364 0.9700000 0.9928030 0.9005000 0.9928030 0.9606364 0.7527381 0.9653182 1.4915185 0.0304849 0.0939109 0.0457350 0.1247937 0.0307973 0.0570742 0.1042853 0.0245394 0.1425174 0.0245394 0.0570742 0.0522905 0.0541923
5 0.8933665 0.9797273 0.1395541 0.9636630 0.9059410 0.9754104 0.9576364 0.9866667 0.9964394 0.8945000 0.9964394 0.9576364 0.7502381 0.9721515 1.4855867 0.0318146 0.1113560 0.0501542 0.1266809 0.0343278 0.0626010 0.0656488 0.0175436 0.1506822 0.0175436 0.0626010 0.0543347 0.0420938
7 0.7204548 0.9818485 0.1785331 0.9498077 0.8685703 0.9658144 0.9448182 0.9650000 0.9916212 0.8671667 0.9916212 0.9448182 0.7403480 0.9549091 1.2620981 0.0289999 0.1195234 0.0626096 0.1615210 0.0434607 0.0756115 0.1143346 0.0268265 0.1732934 0.0268265 0.0756115 0.0651638 0.0672144
9 0.5605292 0.9828864 0.1749770 0.9204945 0.8029630 0.9448317 0.9080909 0.9650000 0.9901692 0.7895000 0.9901692 0.9080909 0.7117491 0.9365455 1.0585117 0.0295163 0.1171403 0.0729802 0.1750837 0.0516909 0.0887710 0.1067187 0.0298665 0.1890471 0.0298665 0.0887710 0.0762842 0.0692240
11 0.5484945 0.9814015 0.1734187 0.9118956 0.7809320 0.9384855 0.8992727 0.9550000 0.9885101 0.7732857 0.9885101 0.8992727 0.7046978 0.9271364 0.9766434 0.0299868 0.1156116 0.0786846 0.1832954 0.0575545 0.0979520 0.1250252 0.0313679 0.1944786 0.0313679 0.0979520 0.0816717 0.0749947
13 0.3365566 0.9851818 0.1777630 0.8916758 0.7380485 0.9232247 0.8744545 0.9500000 0.9872449 0.7293571 0.9872449 0.8744545 0.6852564 0.9122273 0.6780939 0.0278070 0.1147143 0.0886178 0.2020087 0.0660167 0.1102986 0.1329540 0.0333914 0.2073320 0.0333914 0.1102986 0.0905353 0.0836155
15 0.2243062 0.9834470 0.1766718 0.8755128 0.7014784 0.9114931 0.8546364 0.9500000 0.9865101 0.6915238 0.9865101 0.8546364 0.6697985 0.9023182 0.4015166 0.0283997 0.1126839 0.0807769 0.1736652 0.0616360 0.1063906 0.1264645 0.0336972 0.1869614 0.0336972 0.1063906 0.0880706 0.0704500
17 0.2274168 0.9836364 0.1792931 0.8793040 0.7152991 0.9135704 0.8518182 0.9766667 0.9943636 0.6916905 0.9943636 0.8518182 0.6676007 0.9142424 0.4021118 0.0324029 0.1167965 0.0797052 0.1664418 0.0613493 0.1052864 0.0948151 0.0224466 0.1815402 0.0224466 0.1052864 0.0873583 0.0600769
19 0.1814400 0.9831667 0.1844083 0.8840476 0.7324908 0.9164226 0.8518182 1.0000000 1.0000000 0.6933571 1.0000000 0.8518182 0.6676007 0.9259091 0.1294257 0.0321932 0.1136392 0.0828757 0.1737899 0.0633706 0.1052864 0.0000000 0.0000000 0.1816867 0.0000000 0.1052864 0.0873583 0.0526432
21 0.1901237 0.9772727 0.1871508 0.8840476 0.7324908 0.9164226 0.8518182 1.0000000 1.0000000 0.6933571 1.0000000 0.8518182 0.6676007 0.9259091 0.1329789 0.0455998 0.1126461 0.0828757 0.1737899 0.0633706 0.1052864 0.0000000 0.0000000 0.1816867 0.0000000 0.1052864 0.0873583 0.0526432
23 0.1970124 0.9733409 0.1959161 0.8840476 0.7324908 0.9164226 0.8518182 1.0000000 1.0000000 0.6933571 1.0000000 0.8518182 0.6676007 0.9259091 0.1391554 0.0491356 0.1172844 0.0828757 0.1737899 0.0633706 0.1052864 0.0000000 0.0000000 0.1816867 0.0000000 0.1052864 0.0873583 0.0526432
25 0.2038120 0.9658485 0.2081898 0.8840476 0.7324908 0.9164226 0.8518182 1.0000000 1.0000000 0.6933571 1.0000000 0.8518182 0.6676007 0.9259091 0.1432662 0.0539180 0.1193817 0.0828757 0.1737899 0.0633706 0.1052864 0.0000000 0.0000000 0.1816867 0.0000000 0.1052864 0.0873583 0.0526432
27 0.2093675 0.9630833 0.2255691 0.8840476 0.7324908 0.9164226 0.8518182 1.0000000 1.0000000 0.6933571 1.0000000 0.8518182 0.6676007 0.9259091 0.1458288 0.0536117 0.1108393 0.0828757 0.1737899 0.0633706 0.1052864 0.0000000 0.0000000 0.1816867 0.0000000 0.1052864 0.0873583 0.0526432
29 0.2155705 0.9577955 0.2289920 0.8840476 0.7324908 0.9164226 0.8518182 1.0000000 1.0000000 0.6933571 1.0000000 0.8518182 0.6676007 0.9259091 0.1499417 0.0606386 0.1087950 0.0828757 0.1737899 0.0633706 0.1052864 0.0000000 0.0000000 0.1816867 0.0000000 0.1052864 0.0873583 0.0526432
#The optimal selected model
selected_model<-fit_knn_ftir$results %>% filter(k==as.numeric(fit_knn_ftir$bestTune))
knitr::kable(selected_model)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
5 0.8933665 0.9797273 0.1395541 0.963663 0.905941 0.9754104 0.9576364 0.9866667 0.9964394 0.8945 0.9964394 0.9576364 0.7502381 0.9721515 1.485587 0.0318146 0.111356 0.0501542 0.1266809 0.0343278 0.062601 0.0656488 0.0175436 0.1506822 0.0175436 0.062601 0.0543347 0.0420938
#Uv_Vis CV results
print(paste('The optimal number of k for training the UV-Vis-kNN model is',fit_knn_uvvis$bestTune))
[1] "The optimal number of k for training the UV-Vis-kNN model is 7"
#Output table
knitr::kable(fit_knn_uvvis$results)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
3 0.0386068 0.9983333 0.0301923 0.9919872 0.9682393 0.9952814 1 0.9583333 0.9910606 1 0.9910606 1 0.783663 0.9791667 0.2741752 0.0166667 0.0700284 0.0265379 0.1055152 0.0156278 0 0.1368538 0.0294567 0 0.0294567 0 0.0257429 0.0684269
5 0.0430485 0.9983333 0.0401923 0.9904487 0.9633337 0.9943290 1 0.9516667 0.9892424 1 0.9892424 1 0.783663 0.9758333 0.2784509 0.0166667 0.0777488 0.0282224 0.1095893 0.0167345 0 0.1427150 0.0315988 0 0.0315988 0 0.0257429 0.0713575
7 0.0446054 0.9983333 0.0428590 0.9904487 0.9633337 0.9943290 1 0.9516667 0.9892424 1 0.9892424 1 0.783663 0.9758333 0.2763862 0.0166667 0.0784363 0.0282224 0.1095893 0.0167345 0 0.1427150 0.0315988 0 0.0315988 0 0.0257429 0.0713575
9 0.0459053 0.9983333 0.0453590 0.9881410 0.9559752 0.9929004 1 0.9416667 0.9865152 1 0.9865152 1 0.783663 0.9708333 0.2728059 0.0166667 0.0778187 0.0304287 0.1150357 0.0181749 0 0.1505228 0.0343822 0 0.0343822 0 0.0257429 0.0752614
11 0.0476304 0.9983333 0.0453590 0.9842949 0.9437110 0.9905195 1 0.9250000 0.9819697 1 0.9819697 1 0.783663 0.9625000 0.2797878 0.0166667 0.0778187 0.0334315 0.1225932 0.0201241 0 0.1613200 0.0381423 0 0.0381423 0 0.0257429 0.0806600
13 0.0470412 0.9983333 0.0453590 0.9850641 0.9461639 0.9909957 1 0.9283333 0.9828788 1 0.9828788 1 0.783663 0.9641667 0.2779242 0.0166667 0.0778187 0.0328893 0.1212197 0.0197728 0 0.1593600 0.0374651 0 0.0374651 0 0.0257429 0.0796800
15 0.0510874 0.9983333 0.0468135 0.9850641 0.9461639 0.9909957 1 0.9283333 0.9828788 1 0.9828788 1 0.783663 0.9641667 0.2801883 0.0166667 0.0774061 0.0328893 0.1212197 0.0197728 0 0.1593600 0.0374651 0 0.0374651 0 0.0257429 0.0796800
17 0.0527656 0.9983333 0.0538135 0.9842949 0.9437110 0.9905195 1 0.9250000 0.9819697 1 0.9819697 1 0.783663 0.9625000 0.2813498 0.0166667 0.0808877 0.0334315 0.1225932 0.0201241 0 0.1613200 0.0381423 0 0.0381423 0 0.0257429 0.0806600
19 0.0550901 0.9983333 0.0564347 0.9842949 0.9437110 0.9905195 1 0.9250000 0.9819697 1 0.9819697 1 0.783663 0.9625000 0.2828743 0.0166667 0.0826447 0.0334315 0.1225932 0.0201241 0 0.1613200 0.0381423 0 0.0381423 0 0.0257429 0.0806600
21 0.0550807 0.9983333 0.0620559 0.9842949 0.9437110 0.9905195 1 0.9250000 0.9819697 1 0.9819697 1 0.783663 0.9625000 0.2806436 0.0166667 0.0817685 0.0334315 0.1225932 0.0201241 0 0.1613200 0.0381423 0 0.0381423 0 0.0257429 0.0806600
23 0.0563168 0.9983333 0.0690559 0.9842949 0.9437110 0.9905195 1 0.9250000 0.9819697 1 0.9819697 1 0.783663 0.9625000 0.2819735 0.0166667 0.0887075 0.0334315 0.1225932 0.0201241 0 0.1613200 0.0381423 0 0.0381423 0 0.0257429 0.0806600
25 0.0584497 0.9978333 0.0734650 0.9842949 0.9437110 0.9905195 1 0.9250000 0.9819697 1 0.9819697 1 0.783663 0.9625000 0.2885495 0.0169843 0.0889019 0.0334315 0.1225932 0.0201241 0 0.1613200 0.0381423 0 0.0381423 0 0.0257429 0.0806600
27 0.0584218 0.9983333 0.0760105 0.9842949 0.9437110 0.9905195 1 0.9250000 0.9819697 1 0.9819697 1 0.783663 0.9625000 0.2834763 0.0166667 0.0910681 0.0334315 0.1225932 0.0201241 0 0.1613200 0.0381423 0 0.0381423 0 0.0257429 0.0806600
29 0.0570551 0.9980833 0.0785711 0.9842949 0.9437110 0.9905195 1 0.9250000 0.9819697 1 0.9819697 1 0.783663 0.9625000 0.2805573 0.0168281 0.0924979 0.0334315 0.1225932 0.0201241 0 0.1613200 0.0381423 0 0.0381423 0 0.0257429 0.0806600
#The optimal selected model
selected_model<-fit_knn_uvvis$results %>% filter(k==as.numeric(fit_knn_uvvis$bestTune))
knitr::kable(selected_model)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
7 0.0446054 0.9983333 0.042859 0.9904487 0.9633337 0.994329 1 0.9516667 0.9892424 1 0.9892424 1 0.783663 0.9758333 0.2763862 0.0166667 0.0784363 0.0282224 0.1095893 0.0167345 0 0.142715 0.0315988 0 0.0315988 0 0.0257429 0.0713575
#GC-MS CV results
print(paste('The optimal number of k for training the GC-MS-kNN model is',fit_knn_gc$bestTune))
[1] "The optimal number of k for training the GC-MS-kNN model is 3"
#Output table
knitr::kable(fit_knn_gc$results)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
3 1.1781957 0.9535833 0.1193805 0.9324269 0.7945562 0.9572181 0.9296667 0.9466667 0.9891520 0.7660238 0.9891520 0.9296667 0.7763450 0.9381667 1.5696388 0.0751456 0.0877987 0.0626074 0.1778655 0.0408978 0.0664540 0.1316391 0.0273295 0.1868231 0.0273295 0.0664540 0.0554388 0.0781157
5 0.9362806 0.9621250 0.2202883 0.9224561 0.7608615 0.9510592 0.9210417 0.9300000 0.9859804 0.7327024 0.9859804 0.9210417 0.7691520 0.9255208 1.2407607 0.0606882 0.1088330 0.0593946 0.1699763 0.0391227 0.0636617 0.1520160 0.0308188 0.1721365 0.0308188 0.0636617 0.0532472 0.0831265
7 0.7005914 0.9706875 0.2492645 0.9147368 0.7461970 0.9455017 0.9111250 0.9333333 0.9863376 0.7148095 0.9863376 0.9111250 0.7608772 0.9222292 1.0590840 0.0417287 0.1062247 0.0694688 0.1891092 0.0467596 0.0746509 0.1498222 0.0311037 0.1889182 0.0311037 0.0746509 0.0624952 0.0871199
9 0.5943910 0.9726736 0.2931924 0.9125146 0.7468903 0.9436252 0.9024583 0.9633333 0.9923330 0.6973571 0.9923330 0.9024583 0.7536550 0.9328958 0.8837518 0.0400254 0.1065945 0.0647699 0.1622249 0.0445360 0.0731729 0.1048220 0.0220128 0.1655193 0.0220128 0.0731729 0.0614413 0.0662612
11 0.5507476 0.9736458 0.3204781 0.8981287 0.7135036 0.9336807 0.8872083 0.9533333 0.9900326 0.6689286 0.9900326 0.8872083 0.7409357 0.9202708 0.8068333 0.0422458 0.1070067 0.0734556 0.1828167 0.0510942 0.0827807 0.1162450 0.0249260 0.1870346 0.0249260 0.0827807 0.0695922 0.0745285
13 0.5629811 0.9701042 0.3426332 0.8871345 0.6876275 0.9260932 0.8727083 0.9600000 0.9915400 0.6359048 0.9915400 0.8727083 0.7288304 0.9163542 0.8048383 0.0444075 0.1081122 0.0709246 0.1679965 0.0501584 0.0822965 0.1088662 0.0230810 0.1671044 0.0230810 0.0822965 0.0692209 0.0681536
15 0.5575517 0.9718958 0.3752366 0.8766959 0.6639985 0.9186399 0.8615417 0.9533333 0.9899466 0.6144177 0.9899466 0.8615417 0.7195029 0.9074375 0.7910438 0.0457810 0.0993164 0.0796417 0.1866327 0.0574298 0.0893628 0.1341724 0.0292116 0.1743773 0.0292116 0.0893628 0.0750128 0.0838365
17 0.5654109 0.9691736 0.3879049 0.8822222 0.6788662 0.9224513 0.8661667 0.9633333 0.9921401 0.6268452 0.9921401 0.8661667 0.7233626 0.9147500 0.7893084 0.0469051 0.1008465 0.0741825 0.1717596 0.0527941 0.0861544 0.1048220 0.0225559 0.1717766 0.0225559 0.0861544 0.0723446 0.0683921
19 0.5746048 0.9677083 0.4071876 0.8739181 0.6587168 0.9168294 0.8582083 0.9533333 0.9897591 0.6100714 0.9897591 0.8582083 0.7167251 0.9057708 0.7885451 0.0471617 0.0950340 0.0801004 0.1921651 0.0559187 0.0887931 0.1341724 0.0296619 0.1806967 0.0296619 0.0887931 0.0746076 0.0852134
21 0.5652140 0.9668472 0.4287132 0.8777778 0.6699737 0.9191803 0.8608333 0.9633333 0.9921561 0.6186190 0.9921561 0.8608333 0.7189181 0.9120833 0.7747068 0.0470542 0.0901682 0.0777609 0.1789087 0.0555093 0.0898562 0.1150318 0.0247525 0.1744950 0.0247525 0.0898562 0.0755227 0.0737295
23 0.5739903 0.9653194 0.4461870 0.8702632 0.6529785 0.9139687 0.8525417 0.9600000 0.9911785 0.6022262 0.9911785 0.8525417 0.7119591 0.9062708 0.7730142 0.0489143 0.0962882 0.0800987 0.1837186 0.0573254 0.0909289 0.1187288 0.0262048 0.1765100 0.0262048 0.0909289 0.0760853 0.0779127
25 0.5309258 0.9632153 0.4628055 0.8724269 0.6564478 0.9155838 0.8544583 0.9633333 0.9921144 0.6024762 0.9921144 0.8544583 0.7135673 0.9088958 0.7171422 0.0493854 0.1000864 0.0760412 0.1714441 0.0544933 0.0877457 0.1150318 0.0248488 0.1619774 0.0248488 0.0877457 0.0735005 0.0730034
27 0.5224528 0.9630139 0.4832953 0.8659064 0.6439047 0.9109062 0.8466667 0.9633333 0.9920595 0.5920476 0.9920595 0.8466667 0.7070468 0.9050000 0.7026586 0.0511184 0.0984323 0.0779317 0.1771206 0.0554338 0.0901699 0.1150318 0.0249963 0.1698918 0.0249963 0.0901699 0.0754140 0.0735205
29 0.5117543 0.9637569 0.5015714 0.8668129 0.6472936 0.9113940 0.8477083 0.9633333 0.9918587 0.5961190 0.9918587 0.8477083 0.7079532 0.9055208 0.6828549 0.0519591 0.0990304 0.0793248 0.1764620 0.0567925 0.0918757 0.1048220 0.0235049 0.1721001 0.0235049 0.0918757 0.0771838 0.0710863
#The optimal selected model
selected_model<-fit_knn_gc$results %>% filter(k==as.numeric(fit_knn_gc$bestTune))
knitr::kable(selected_model)
k logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
3 1.178196 0.9535833 0.1193805 0.9324269 0.7945562 0.9572181 0.9296667 0.9466667 0.989152 0.7660238 0.989152 0.9296667 0.776345 0.9381667 1.569639 0.0751456 0.0877987 0.0626074 0.1778655 0.0408978 0.066454 0.1316391 0.0273295 0.1868231 0.0273295 0.066454 0.0554388 0.0781157

Testing kNN Classification Models

Hyperspectral Imaging kNN Test Results
#Predict HSI test set
test_hsi_knn<-predict(fit_knn_hsi,newdata=data_test_hsi_cf)

#get the confusion matrix
cfmatrix_hsi<-confusionMatrix(test_hsi_knn,data_test_hsi_cf$class_1)

#print the confusion matrix
print(cfmatrix_hsi)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         0
  Pure_EVOO             0        11
                                     
               Accuracy : 1          
                 95% CI : (0.934, 1) 
    No Information Rate : 0.7963     
    P-Value [Acc > NIR] : 4.55e-06   
                                     
                  Kappa : 1          
                                     
 Mcnemar's Test P-Value : NA         
                                     
            Sensitivity : 1.0000     
            Specificity : 1.0000     
         Pos Pred Value : 1.0000     
         Neg Pred Value : 1.0000     
             Prevalence : 0.7963     
         Detection Rate : 0.7963     
   Detection Prevalence : 0.7963     
      Balanced Accuracy : 1.0000     
                                     
       'Positive' Class : Adulterated
                                     
knitr::kable(cfmatrix_hsi$byClass)
x
Sensitivity 1.0000000
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 1.0000000
Precision 1.0000000
Recall 1.0000000
F1 1.0000000
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.7962963
Balanced Accuracy 1.0000000
#View the results as knitr table
knitr::kable(cfmatrix_hsi$table)
Adulterated Pure_EVOO
Adulterated 43 0
Pure_EVOO 0 11
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix (rows = Prediction, columns = Reference)
cfmatrix_hsi_knn_table <- cfmatrix_hsi$table
TP <- cfmatrix_hsi_knn_table[1,1] # true positives: predicted and actual Adulterated
TN <- cfmatrix_hsi_knn_table[2,2] # true negatives: predicted and actual Pure_EVOO
FP <- cfmatrix_hsi_knn_table[1,2] # false positives: predicted Adulterated, actual Pure_EVOO
FN <- cfmatrix_hsi_knn_table[2,1] # false negatives: predicted Pure_EVOO, actual Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 1"
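The same arithmetic is repeated for each technique below; a small base-R helper (the name `mcc_from_table` is hypothetical, not part of the original analysis) could compute MCC directly from a caret confusion-matrix table:

```r
# Hypothetical helper: MCC from a 2x2 table laid out as caret produces it
# (rows = Prediction, columns = Reference, positive class first).
mcc_from_table <- function(cm) {
  TP <- cm[1, 1]; TN <- cm[2, 2]
  FP <- cm[1, 2]; FN <- cm[2, 1]
  num <- TP * TN - FP * FN
  den <- sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
  if (den == 0) return(0)  # common convention when a margin is zero
  num / den
}

# The HSI kNN counts above (43, 0, 0, 11) give a perfect score:
mcc_from_table(matrix(c(43, 0, 0, 11), nrow = 2, byrow = TRUE))  # -> 1
```

Calling it as `mcc_from_table(cfmatrix_hsi$table)` would reproduce the printed value.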
Raman Spectroscopy kNN Test Results
#Predict Raman test set
test_raman_knn<-predict(fit_knn_raman,newdata=data_test_raman_cf)

#get the confusion matrix
cfmatrix_raman<-confusionMatrix(test_raman_knn,data_test_raman_cf$class_1)

#print the confusion matrix
print(cfmatrix_raman)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          38         4
  Pure_EVOO             5         7
                                          
               Accuracy : 0.8333          
                 95% CI : (0.7071, 0.9208)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.3154          
                                          
                  Kappa : 0.5031          
                                          
 Mcnemar's Test P-Value : 1.0000          
                                          
            Sensitivity : 0.8837          
            Specificity : 0.6364          
         Pos Pred Value : 0.9048          
         Neg Pred Value : 0.5833          
             Prevalence : 0.7963          
         Detection Rate : 0.7037          
   Detection Prevalence : 0.7778          
      Balanced Accuracy : 0.7600          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_raman$byClass)
x
Sensitivity 0.8837209
Specificity 0.6363636
Pos Pred Value 0.9047619
Neg Pred Value 0.5833333
Precision 0.9047619
Recall 0.8837209
F1 0.8941176
Prevalence 0.7962963
Detection Rate 0.7037037
Detection Prevalence 0.7777778
Balanced Accuracy 0.7600423
#View the results as knitr table
knitr::kable(cfmatrix_raman$table)
Adulterated Pure_EVOO
Adulterated 38 4
Pure_EVOO 5 7
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix (rows = Prediction, columns = Reference)
cfmatrix_raman_knn_table <- cfmatrix_raman$table
TP <- cfmatrix_raman_knn_table[1,1] # true positives: predicted and actual Adulterated
TN <- cfmatrix_raman_knn_table[2,2] # true negatives: predicted and actual Pure_EVOO
FP <- cfmatrix_raman_knn_table[1,2] # false positives: predicted Adulterated, actual Pure_EVOO
FN <- cfmatrix_raman_knn_table[2,1] # false negatives: predicted Pure_EVOO, actual Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.5"
FTIR Spectroscopy kNN Test Results
#Predict FTIR test set
test_ftir_knn<-predict(fit_knn_ftir,newdata=data_test_ftir_cf)

#get the confusion matrix
cfmatrix_ftir<-confusionMatrix(test_ftir_knn,data_test_ftir_cf$class_1)

#print the confusion matrix
print(cfmatrix_ftir)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          39         0
  Pure_EVOO             4        11
                                          
               Accuracy : 0.9259          
                 95% CI : (0.8211, 0.9794)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.008546        
                                          
                  Kappa : 0.7989          
                                          
 Mcnemar's Test P-Value : 0.133614        
                                          
            Sensitivity : 0.9070          
            Specificity : 1.0000          
         Pos Pred Value : 1.0000          
         Neg Pred Value : 0.7333          
             Prevalence : 0.7963          
         Detection Rate : 0.7222          
   Detection Prevalence : 0.7222          
      Balanced Accuracy : 0.9535          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_ftir$byClass)
x
Sensitivity 0.9069767
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 0.7333333
Precision 1.0000000
Recall 0.9069767
F1 0.9512195
Prevalence 0.7962963
Detection Rate 0.7222222
Detection Prevalence 0.7222222
Balanced Accuracy 0.9534884
#View the results as knitr table
knitr::kable(cfmatrix_ftir$table)
Adulterated Pure_EVOO
Adulterated 39 0
Pure_EVOO 4 11
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix (rows = Prediction, columns = Reference)
cfmatrix_ftir_knn_table <- cfmatrix_ftir$table
TP <- cfmatrix_ftir_knn_table[1,1] # true positives: predicted and actual Adulterated
TN <- cfmatrix_ftir_knn_table[2,2] # true negatives: predicted and actual Pure_EVOO
FP <- cfmatrix_ftir_knn_table[1,2] # false positives: predicted Adulterated, actual Pure_EVOO
FN <- cfmatrix_ftir_knn_table[2,1] # false negatives: predicted Pure_EVOO, actual Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.82"
UV-Vis Spectroscopy kNN Test Results
#Predict uvvis test set
test_uvvis_knn<-predict(fit_knn_uvvis,newdata=data_test_uvvis_cf)

#get the confusion matrix
cfmatrix_uvvis<-confusionMatrix(test_uvvis_knn,data_test_uvvis_cf$class_1)

#print the confusion matrix
print(cfmatrix_uvvis)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         2
  Pure_EVOO             0         9
                                          
               Accuracy : 0.963           
                 95% CI : (0.8725, 0.9955)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.0004935       
                                          
                  Kappa : 0.8776          
                                          
 Mcnemar's Test P-Value : 0.4795001       
                                          
            Sensitivity : 1.0000          
            Specificity : 0.8182          
         Pos Pred Value : 0.9556          
         Neg Pred Value : 1.0000          
             Prevalence : 0.7963          
         Detection Rate : 0.7963          
   Detection Prevalence : 0.8333          
      Balanced Accuracy : 0.9091          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_uvvis$byClass)
x
Sensitivity 1.0000000
Specificity 0.8181818
Pos Pred Value 0.9555556
Neg Pred Value 1.0000000
Precision 0.9555556
Recall 1.0000000
F1 0.9772727
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.8333333
Balanced Accuracy 0.9090909
#View the results as knitr table
knitr::kable(cfmatrix_uvvis$table)
Adulterated Pure_EVOO
Adulterated 43 2
Pure_EVOO 0 9
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix (rows = Prediction, columns = Reference)
cfmatrix_uvvis_knn_table <- cfmatrix_uvvis$table
TP <- cfmatrix_uvvis_knn_table[1,1] # true positives: predicted and actual Adulterated
TN <- cfmatrix_uvvis_knn_table[2,2] # true negatives: predicted and actual Pure_EVOO
FP <- cfmatrix_uvvis_knn_table[1,2] # false positives: predicted Adulterated, actual Pure_EVOO
FN <- cfmatrix_uvvis_knn_table[2,1] # false negatives: predicted Pure_EVOO, actual Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.88"
GC-MS kNN Test Results
#Predict gc test set
test_gc_knn<-predict(fit_knn_gc,newdata=data_test_gc_cf)

#get the confusion matrix
cfmatrix_gc<-confusionMatrix(test_gc_knn,data_test_gc_cf$class_1)

#print the confusion matrix
print(cfmatrix_gc)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          58         0
  Pure_EVOO             6        12
                                         
               Accuracy : 0.9211         
                 95% CI : (0.836, 0.9705)
    No Information Rate : 0.8421         
    P-Value [Acc > NIR] : 0.03392        
                                         
                  Kappa : 0.7532         
                                         
 Mcnemar's Test P-Value : 0.04123        
                                         
            Sensitivity : 0.9062         
            Specificity : 1.0000         
         Pos Pred Value : 1.0000         
         Neg Pred Value : 0.6667         
             Prevalence : 0.8421         
         Detection Rate : 0.7632         
   Detection Prevalence : 0.7632         
      Balanced Accuracy : 0.9531         
                                         
       'Positive' Class : Adulterated    
                                         
knitr::kable(cfmatrix_gc$byClass)
x
Sensitivity 0.9062500
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 0.6666667
Precision 1.0000000
Recall 0.9062500
F1 0.9508197
Prevalence 0.8421053
Detection Rate 0.7631579
Detection Prevalence 0.7631579
Balanced Accuracy 0.9531250
#View the results as knitr table
knitr::kable(cfmatrix_gc$table)
Adulterated Pure_EVOO
Adulterated 58 0
Pure_EVOO 6 12
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix (rows = Prediction, columns = Reference)
cfmatrix_gc_knn_table <- cfmatrix_gc$table
TP <- cfmatrix_gc_knn_table[1,1] # true positives: predicted and actual Adulterated
TN <- cfmatrix_gc_knn_table[2,2] # true negatives: predicted and actual Pure_EVOO
FP <- cfmatrix_gc_knn_table[1,2] # false positives: predicted Adulterated, actual Pure_EVOO
FN <- cfmatrix_gc_knn_table[2,1] # false negatives: predicted Pure_EVOO, actual Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.78"
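Collecting the test-set results above gives a quick side-by-side view of the five techniques. The values are transcribed from the confusion matrices and MCC printouts above, and the object name is illustrative:

```r
# Summary of kNN test-set performance, transcribed from the results above
knn_test_summary <- data.frame(
  Technique = c("NIR-HSI", "Raman", "FTIR", "UV-Vis", "GC-MS"),
  Accuracy  = c(1.0000, 0.8333, 0.9259, 0.9630, 0.9211),
  MCC       = c(1.00, 0.50, 0.82, 0.88, 0.78)
)
knn_test_summary
```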
Plot Confusion Matrix Tables for kNN Binary Classification Algorithm
# Plotting the confusion matrix
#HSI kNN confusion Matrix
cf_hsi_knn<-ggplot(data = as.data.frame(cfmatrix_hsi$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "white", high = "#99ccff", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix HSI kNN')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#Raman kNN confusion Matrix
cf_raman_knn<-ggplot(data = as.data.frame(cfmatrix_raman$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkorange3", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix Raman kNN')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#FTIR kNN confusion Matrix
cf_ftir_knn<-ggplot(data = as.data.frame(cfmatrix_ftir$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkseagreen2", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix FTIR kNN')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#UV-Vis kNN confusion Matrix
cf_uvvis_knn<-ggplot(data = as.data.frame(cfmatrix_uvvis$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "turquoise", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix UV-Vis kNN')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))
grid.arrange(cf_hsi_knn,cf_raman_knn,cf_ftir_knn,cf_uvvis_knn,nrow = 2)

#GC-MS kNN confusion Matrix
ggplot(data = as.data.frame(cfmatrix_gc$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "tan1", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix GC-MS kNN')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))
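The five plotting chunks above differ only in the confusion matrix, the fill gradient, and the title; a helper along these lines (the function name is hypothetical) would remove the duplication:

```r
# Hypothetical wrapper around the repeated heatmap code; assumes ggplot2
# is loaded and cm is a caret confusionMatrix object.
plot_cf <- function(cm, low, high, title) {
  ggplot(data = as.data.frame(cm$table), aes(x = Reference, y = Prediction)) +
    geom_tile(aes(fill = Freq), color = "black") +
    geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1, color = "black", size = 4) +
    scale_fill_gradient(low = low, high = high, name = "Frequency") +
    theme_minimal() +
    labs(x = "Actual Class", y = "Predicted Class", title = title) +
    theme(legend.position = "none",
          axis.text.y = element_text(color = "black", size = 8),
          axis.title.x = element_text(size = 8),
          axis.title.y = element_text(size = 8),
          plot.title = element_text(size = 8, hjust = 0.5),
          aspect.ratio = 1)
}

# e.g. plot_cf(cfmatrix_hsi, "white", "#99ccff", "Confusion Matrix HSI kNN")
```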

Model 2. Random Forest (RF)

  • Random Forest (RF) is a versatile and powerful supervised machine learning algorithm used for both classification and regression tasks. It builds an ensemble of decision trees on bootstrapped samples of the training data, considering a random subset of predictors at each split, and outputs the majority vote across trees for classification or the average prediction for regression.
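The majority-vote step can be illustrated in base R; the per-tree votes below are made up for illustration:

```r
# Hypothetical class votes from five trees for a single test sample
tree_votes <- c("Adulterated", "Pure_EVOO", "Adulterated", "Adulterated", "Pure_EVOO")

# The forest's prediction is the most frequent class across trees
majority <- names(which.max(table(tree_votes)))
majority  # -> "Adulterated"
```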
#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

#start_time
start_time<-Sys.time()

#Tune mtry, the number of predictors randomly sampled at each split
#mtry candidates are generated via tuneLength and selected by cross-validation

#Train RF models for all the techniques

#HSI RF model
fit_rf_hsi<-train(y=data_train_hsi_cf[,1],x=data_train_hsi_cf[,-1],
                   method = "rf",tuneLength = 10,trControl = control,metric = metric)
note: only 4 unique complexity parameters in default grid. Truncating the grid to 4 .
#Raman RF model
fit_rf_raman<-train(y=data_train_raman_cf[,1],x=data_train_raman_cf[,-1],
                   method = "rf",tuneLength = 10,trControl = control,metric = metric)
note: only 4 unique complexity parameters in default grid. Truncating the grid to 4 .
#FTIR RF model
fit_rf_ftir<-train(y=data_train_ftir_cf[,1],x=data_train_ftir_cf[,-1],
                   method = "rf",tuneLength = 10,trControl = control,metric = metric)
note: only 4 unique complexity parameters in default grid. Truncating the grid to 4 .
#UV-Vis RF model
fit_rf_uvvis<-train(y=data_train_uvvis_cf[,1],x=data_train_uvvis_cf[,-1],
                   method = "rf",tuneLength = 10,trControl = control,metric = metric)
note: only 4 unique complexity parameters in default grid. Truncating the grid to 4 .
#GC-MS RF model
fit_rf_gc<-train(y=data_train_gc_cf[,1],x=data_train_gc_cf[,-1],
                   method = "rf",tuneLength = 10,trControl = control,metric = metric)
note: only 4 unique complexity parameters in default grid. Truncating the grid to 4 .
#End_time
end_time<-Sys.time()
model_training_time<-end_time-start_time
print(model_training_time)
Time difference of 43.04118 secs
stopCluster(cl) #stop the parallel cluster
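The repeated caret note during training means that, with the small number of predictors here, tuneLength = 10 collapses to only four distinct mtry candidates (2 to 5 in the CV results tables). An explicit grid could be supplied via tuneGrid instead; this is a sketch, with the grid values mirroring what caret actually evaluated:

```r
# Explicit mtry grid, matching the four values caret evaluated
rf_grid <- expand.grid(mtry = c(2, 3, 4, 5))

# Hypothetical usage (same arguments as the train() calls above):
# fit_rf_hsi <- train(y = data_train_hsi_cf[,1], x = data_train_hsi_cf[,-1],
#                     method = "rf", tuneGrid = rf_grid,
#                     trControl = control, metric = metric)
```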
Plot the RF CV Model Results
#HSI CV Plot
p1<-ggplot(fit_rf_hsi)+geom_line(colour = "red")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='HSI RF Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

 #Raman CV Plot
p2<-ggplot(fit_rf_raman)+geom_line(colour = "blue")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='Raman RF Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#FTIR CV Plot
p3<-ggplot(fit_rf_ftir)+geom_line(colour = "black")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='FTIR RF Model Training', y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#UV-Vis CV Plot
p4<-ggplot(fit_rf_uvvis)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='UV-Vis RF Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

#Arrange the RF training model plots

grid.arrange(p1,p2,p3,p4,nrow = 2)

#GC-MS RF CV Plot
ggplot(fit_rf_gc)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='GC-MS RF Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

Display RF Cross-Validation Results
#HSI RF CV results
print(paste('The optimal number of mtry for training the HSI-RF model is',fit_rf_hsi$bestTune))
[1] "The optimal number of mtry for training the HSI-RF model is 2"
#Output HSI RF table
knitr::kable(fit_rf_hsi$results)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.0262941 1 0.5970909 1.0000000 1.0000000 1.0000000 1 1.0000000 1.0000000 1 1.0000000 1 0.783663 1.0000000 0.0162547 0 0.1224532 0.0000000 0.0000000 0.0000000 0 0.0000000 0.0000000 0 0.0000000 0 0.0257429 0.0000000
3 0.0167133 1 0.4224545 0.9992308 0.9975472 0.9995238 1 0.9966667 0.9990909 1 0.9990909 1 0.783663 0.9983333 0.0128262 0 0.1313483 0.0076923 0.0245283 0.0047619 0 0.0333333 0.0090909 0 0.0090909 0 0.0257429 0.0166667
4 0.0127291 1 0.2850303 0.9992308 0.9975472 0.9995238 1 0.9966667 0.9990909 1 0.9990909 1 0.783663 0.9983333 0.0158460 0 0.1397528 0.0076923 0.0245283 0.0047619 0 0.0333333 0.0090909 0 0.0090909 0 0.0257429 0.0166667
5 0.0122280 1 0.1190000 0.9992308 0.9975472 0.9995238 1 0.9966667 0.9990909 1 0.9990909 1 0.783663 0.9983333 0.0242736 0 0.1114280 0.0076923 0.0245283 0.0047619 0 0.0333333 0.0090909 0 0.0090909 0 0.0257429 0.0166667
#The optimal selected model
selected_model<-fit_rf_hsi$results %>% filter(mtry==as.numeric(fit_rf_hsi$bestTune))
knitr::kable(selected_model)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.0262941 1 0.5970909 1 1 1 1 1 1 1 1 1 0.783663 1 0.0162547 0 0.1224532 0 0 0 0 0 0 0 0 0 0.0257429 0
#Raman CV results
print(paste('The optimal number of mtry for training the Raman-RF model is',fit_rf_raman$bestTune))
[1] "The optimal number of mtry for training the Raman-RF model is 2"
#Output RF Raman table
knitr::kable(fit_rf_raman$results)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.1301516 0.9903939 0.6888850 0.9585897 0.8754587 0.9733728 0.9725455 0.910 0.9768485 0.9253333 0.9768485 0.9725455 0.7621703 0.9412727 0.1133321 0.0238073 0.0757046 0.0515578 0.1607524 0.0331805 0.0486392 0.1810618 0.0459291 0.1291533 0.0459291 0.0486392 0.0466757 0.0905398
3 0.1608713 0.9908333 0.6001799 0.9554487 0.8676432 0.9712205 0.9686364 0.910 0.9769242 0.9171667 0.9769242 0.9686364 0.7590293 0.9393182 0.2949836 0.0209640 0.0972382 0.0524829 0.1611227 0.0340038 0.0534224 0.1810618 0.0458223 0.1379556 0.0458223 0.0534224 0.0486668 0.0893799
4 0.3788020 0.9851061 0.4832984 0.9509524 0.8593083 0.9678939 0.9617273 0.915 0.9776061 0.9003333 0.9776061 0.9617273 0.7536996 0.9383636 0.8028832 0.0280052 0.1192321 0.0537771 0.1563819 0.0358503 0.0588349 0.1699921 0.0441274 0.1453156 0.0441274 0.0588349 0.0534357 0.0842899
5 0.5400562 0.9757424 0.2567070 0.9470879 0.8515275 0.9650112 0.9549091 0.920 0.9791818 0.8865000 0.9791818 0.9549091 0.7483608 0.9374545 1.0539974 0.0504069 0.1262966 0.0569790 0.1589086 0.0385299 0.0655545 0.1632306 0.0418683 0.1575039 0.0418683 0.0655545 0.0581537 0.0820042
#The optimal selected model
selected_model<-fit_rf_raman$results %>% filter(mtry==as.numeric(fit_rf_raman$bestTune))
knitr::kable(selected_model)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.1301516 0.9903939 0.688885 0.9585897 0.8754587 0.9733728 0.9725455 0.91 0.9768485 0.9253333 0.9768485 0.9725455 0.7621703 0.9412727 0.1133321 0.0238073 0.0757046 0.0515578 0.1607524 0.0331805 0.0486392 0.1810618 0.0459291 0.1291533 0.0459291 0.0486392 0.0466757 0.0905398
#FTIR CV results
print(paste('The optimal number of mtry for training the FTIR-RF model is',fit_rf_ftir$bestTune))
[1] "The optimal number of mtry for training the FTIR-RF model is 2"
#Output FTIR table
knitr::kable(fit_rf_ftir$results)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.1224477 0.9888333 0.5615682 0.9541117 0.8654263 0.9703023 0.9633636 0.9216667 0.9799242 0.8960000 0.9799242 0.9633636 0.7546154 0.9425152 0.1434511 0.0230820 0.1251504 0.0543016 0.1638887 0.0350962 0.0540142 0.1648980 0.0416054 0.1519813 0.0416054 0.0540142 0.0460920 0.0854975
3 0.1921102 0.9881061 0.4850604 0.9500824 0.8514182 0.9678304 0.9623636 0.9050000 0.9761970 0.8918333 0.9761970 0.9623636 0.7538462 0.9336818 0.4875409 0.0247673 0.1395662 0.0542637 0.1694146 0.0347754 0.0542542 0.1776752 0.0437865 0.1570365 0.0437865 0.0542542 0.0464858 0.0907628
4 0.2855418 0.9875758 0.4191739 0.9525824 0.8638921 0.9690290 0.9603636 0.9266667 0.9808333 0.8908333 0.9808333 0.9603636 0.7523077 0.9435152 0.7009602 0.0270964 0.1390322 0.0552986 0.1613261 0.0365918 0.0582535 0.1594744 0.0410313 0.1538393 0.0410313 0.0582535 0.0496924 0.0825672
5 0.3013470 0.9870985 0.2775997 0.9509249 0.8587882 0.9680527 0.9622727 0.9116667 0.9764394 0.8900000 0.9764394 0.9622727 0.7537912 0.9369697 0.7039710 0.0248943 0.1228315 0.0586059 0.1702073 0.0386384 0.0579497 0.1561586 0.0410883 0.1627830 0.0410883 0.0579497 0.0492982 0.0856847
#The optimal selected model
selected_model<-fit_rf_ftir$results %>% filter(mtry==as.numeric(fit_rf_ftir$bestTune))
knitr::kable(selected_model)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.1224477 0.9888333 0.5615682 0.9541117 0.8654263 0.9703023 0.9633636 0.9216667 0.9799242 0.896 0.9799242 0.9633636 0.7546154 0.9425152 0.1434511 0.023082 0.1251504 0.0543016 0.1638887 0.0350962 0.0540142 0.164898 0.0416054 0.1519813 0.0416054 0.0540142 0.046092 0.0854975
#UV-Vis CV results
print(paste('The optimal number of mtry for training the UV-Vis-RF model is',fit_rf_uvvis$bestTune))
[1] "The optimal number of mtry for training the UV-Vis-RF model is 2"
#Output UV-Vis table
knitr::kable(fit_rf_uvvis$results)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.0375808 1.0000000 0.5561061 0.9916484 0.9723504 0.994888 1 0.9633333 0.990303 1 0.990303 1 0.7836264 0.9816667 0.0506710 0.0000000 0.1272036 0.0262694 0.0888021 0.0159877 0 0.1150318 0.0301704 0 0.0301704 0 0.0260631 0.0575159
3 0.0335648 1.0000000 0.3711061 0.9916484 0.9723504 0.994888 1 0.9633333 0.990303 1 0.990303 1 0.7836264 0.9816667 0.0714744 0.0000000 0.1372975 0.0262694 0.0888021 0.0159877 0 0.1150318 0.0301704 0 0.0301704 0 0.0260631 0.0575159
4 0.0358994 1.0000000 0.2206667 0.9916484 0.9723504 0.994888 1 0.9633333 0.990303 1 0.990303 1 0.7836264 0.9816667 0.0986380 0.0000000 0.1299046 0.0262694 0.0888021 0.0159877 0 0.1150318 0.0301704 0 0.0301704 0 0.0260631 0.0575159
5 0.0633852 0.9983333 0.0488438 0.9916484 0.9723504 0.994888 1 0.9633333 0.990303 1 0.990303 1 0.7836264 0.9816667 0.3037597 0.0166667 0.0770500 0.0262694 0.0888021 0.0159877 0 0.1150318 0.0301704 0 0.0301704 0 0.0260631 0.0575159
#The optimal selected model
selected_model<-fit_rf_uvvis$results %>% filter(mtry==as.numeric(fit_rf_uvvis$bestTune))
knitr::kable(selected_model)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.0375808 1 0.5561061 0.9916484 0.9723504 0.994888 1 0.9633333 0.990303 1 0.990303 1 0.7836264 0.9816667 0.050671 0 0.1272036 0.0262694 0.0888021 0.0159877 0 0.1150318 0.0301704 0 0.0301704 0 0.0260631 0.0575159
#GC-MS CV results
print(paste('The optimal number of mtry for training the GC-MS-RF model is',fit_rf_gc$bestTune))
[1] "The optimal number of mtry for training the GC-MS-RF model is 2"
#Output GC-MS table
knitr::kable(fit_rf_gc$results)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.1719807 0.9751944 0.7128553 0.9335673 0.7473550 0.9602766 0.9619583 0.7900000 0.9610093 0.8410000 0.9610093 0.9619583 0.8033041 0.8759792 0.0915327 0.0389475 0.0768440 0.0537230 0.2124190 0.0322625 0.0469677 0.2491512 0.0454670 0.1865635 0.0454670 0.0469677 0.0390237 0.1243059
3 0.1860490 0.9704792 0.6646548 0.9219591 0.7144364 0.9527705 0.9500833 0.7800000 0.9587301 0.7969524 0.9587301 0.9500833 0.7933918 0.8650417 0.1232623 0.0420231 0.0885783 0.0638149 0.2379034 0.0392547 0.0595649 0.2604494 0.0481215 0.2213449 0.0481215 0.0595649 0.0496167 0.1318778
4 0.2403225 0.9667847 0.6156494 0.9125146 0.6857036 0.9467841 0.9414167 0.7666667 0.9559794 0.7714524 0.9559794 0.9414167 0.7861696 0.8540417 0.2910960 0.0442123 0.1000990 0.0661982 0.2399602 0.0410319 0.0644024 0.2659080 0.0494073 0.2319237 0.0494073 0.0644024 0.0538957 0.1338380
5 0.3057595 0.9599653 0.5635310 0.9076023 0.6704488 0.9437785 0.9374583 0.7566667 0.9534226 0.7401667 0.9534226 0.9374583 0.7828655 0.8470625 0.4385199 0.0501104 0.1093446 0.0701865 0.2548637 0.0434640 0.0640848 0.2631715 0.0492389 0.2483740 0.0492389 0.0640848 0.0536483 0.1365726
#The optimal selected model
selected_model<-fit_rf_gc$results %>% filter(mtry==as.numeric(fit_rf_gc$bestTune))
knitr::kable(selected_model)
mtry logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.1719807 0.9751944 0.7128553 0.9335673 0.747355 0.9602766 0.9619583 0.79 0.9610093 0.841 0.9610093 0.9619583 0.8033041 0.8759792 0.0915327 0.0389475 0.076844 0.053723 0.212419 0.0322625 0.0469677 0.2491512 0.045467 0.1865635 0.045467 0.0469677 0.0390237 0.1243059

Test Random Forest Classification Models

Hyperspectral Imaging RF Test Results
#Predict HSI test set
test_hsi_rf<-predict(fit_rf_hsi,newdata=data_test_hsi_cf)

#get the confusion matrix
cfmatrix_hsi<-confusionMatrix(test_hsi_rf,data_test_hsi_cf$class_1)

#print the confusion matrix
print(cfmatrix_hsi)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         3
  Pure_EVOO             0         8
                                          
               Accuracy : 0.9444          
                 95% CI : (0.8461, 0.9884)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.002383        
                                          
                  Kappa : 0.8094          
                                          
 Mcnemar's Test P-Value : 0.248213        
                                          
            Sensitivity : 1.0000          
            Specificity : 0.7273          
         Pos Pred Value : 0.9348          
         Neg Pred Value : 1.0000          
             Prevalence : 0.7963          
         Detection Rate : 0.7963          
   Detection Prevalence : 0.8519          
      Balanced Accuracy : 0.8636          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_hsi$byClass)
x
Sensitivity 1.0000000
Specificity 0.7272727
Pos Pred Value 0.9347826
Neg Pred Value 1.0000000
Precision 0.9347826
Recall 1.0000000
F1 0.9662921
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.8518519
Balanced Accuracy 0.8636364
#View the results as knitr table
knitr::kable(cfmatrix_hsi$table)
Adulterated Pure_EVOO
Adulterated 43 3
Pure_EVOO 0 8
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class; positive class = Adulterated)
cfmatrix_hsi_rf_table <- cfmatrix_hsi$table
TP <- cfmatrix_hsi_rf_table[1,1] #predicted Adulterated, truly Adulterated
TN <- cfmatrix_hsi_rf_table[2,2] #predicted Pure_EVOO, truly Pure_EVOO
FP <- cfmatrix_hsi_rf_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_hsi_rf_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.82"
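Since this TP/TN/FP/FN arithmetic recurs verbatim for every model below, it could be wrapped in a small helper (a sketch; `mcc_from_table` is a name introduced here, not part of the original analysis):

```r
# Compute MCC from a 2x2 caret confusion-matrix table
# (rows = predicted class, columns = reference class, positive class first).
mcc_from_table <- function(tab) {
  TP <- tab[1, 1]; TN <- tab[2, 2]  # correct predictions
  FP <- tab[1, 2]; FN <- tab[2, 1]  # off-diagonal errors
  (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
}

# Applied to the HSI RF confusion matrix above:
round(mcc_from_table(matrix(c(43, 0, 3, 8), nrow = 2)), 2)  # 0.82
```

Note that MCC is unchanged if FP and FN are interchanged, so the helper gives the same values reported below regardless of which class is treated as positive.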
Test RF Raman Model on Test Data
#Predict Raman test set
test_raman_rf<-predict(fit_rf_raman,newdata=data_test_raman_cf)

#get the confusion matrix
cfmatrix_raman<-confusionMatrix(test_raman_rf,data_test_raman_cf$class_1)

#print the confusion matrix
print(cfmatrix_raman)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          40         2
  Pure_EVOO             3         9
                                         
               Accuracy : 0.9074         
                 95% CI : (0.797, 0.9692)
    No Information Rate : 0.7963         
    P-Value [Acc > NIR] : 0.02431        
                                         
                  Kappa : 0.7239         
                                         
 Mcnemar's Test P-Value : 1.00000        
                                         
            Sensitivity : 0.9302         
            Specificity : 0.8182         
         Pos Pred Value : 0.9524         
         Neg Pred Value : 0.7500         
             Prevalence : 0.7963         
         Detection Rate : 0.7407         
   Detection Prevalence : 0.7778         
      Balanced Accuracy : 0.8742         
                                         
       'Positive' Class : Adulterated    
                                         
knitr::kable(cfmatrix_raman$byClass)
x
Sensitivity 0.9302326
Specificity 0.8181818
Pos Pred Value 0.9523810
Neg Pred Value 0.7500000
Precision 0.9523810
Recall 0.9302326
F1 0.9411765
Prevalence 0.7962963
Detection Rate 0.7407407
Detection Prevalence 0.7777778
Balanced Accuracy 0.8742072
#View the results as knitr table
knitr::kable(cfmatrix_raman$table)
Adulterated Pure_EVOO
Adulterated 40 2
Pure_EVOO 3 9
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_raman_rf_table <- cfmatrix_raman$table
TP <- cfmatrix_raman_rf_table[1,1] 
TN <- cfmatrix_raman_rf_table[2,2] 
FP <- cfmatrix_raman_rf_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_raman_rf_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.73"
FTIR Spectroscopy RF Test Results
#Predict FTIR test set
test_ftir_rf<-predict(fit_rf_ftir,newdata=data_test_ftir_cf)

#get the confusion matrix
cfmatrix_ftir<-confusionMatrix(test_ftir_rf,data_test_ftir_cf$class_1)

#print the confusion matrix
print(cfmatrix_ftir)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          42         0
  Pure_EVOO             1        11
                                          
               Accuracy : 0.9815          
                 95% CI : (0.9011, 0.9995)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 6.741e-05       
                                          
                  Kappa : 0.9448          
                                          
 Mcnemar's Test P-Value : 1               
                                          
            Sensitivity : 0.9767          
            Specificity : 1.0000          
         Pos Pred Value : 1.0000          
         Neg Pred Value : 0.9167          
             Prevalence : 0.7963          
         Detection Rate : 0.7778          
   Detection Prevalence : 0.7778          
      Balanced Accuracy : 0.9884          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_ftir$byClass)
x
Sensitivity 0.9767442
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 0.9166667
Precision 1.0000000
Recall 0.9767442
F1 0.9882353
Prevalence 0.7962963
Detection Rate 0.7777778
Detection Prevalence 0.7777778
Balanced Accuracy 0.9883721
#View the results as knitr table
knitr::kable(cfmatrix_ftir$table)
Adulterated Pure_EVOO
Adulterated 42 0
Pure_EVOO 1 11
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_ftir_rf_table <- cfmatrix_ftir$table
TP <- cfmatrix_ftir_rf_table[1,1] 
TN <- cfmatrix_ftir_rf_table[2,2] 
FP <- cfmatrix_ftir_rf_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_ftir_rf_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.95"
Assess UV-Vis RF Model on Test Data
#Predict uvvis test set
test_uvvis_rf<-predict(fit_rf_uvvis,newdata=data_test_uvvis_cf)

#get the confusion matrix
cfmatrix_uvvis<-confusionMatrix(test_uvvis_rf,data_test_uvvis_cf$class_1)

#print the confusion matrix
print(cfmatrix_uvvis)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          42         2
  Pure_EVOO             1         9
                                          
               Accuracy : 0.9444          
                 95% CI : (0.8461, 0.9884)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.002383        
                                          
                  Kappa : 0.8228          
                                          
 Mcnemar's Test P-Value : 1.000000        
                                          
            Sensitivity : 0.9767          
            Specificity : 0.8182          
         Pos Pred Value : 0.9545          
         Neg Pred Value : 0.9000          
             Prevalence : 0.7963          
         Detection Rate : 0.7778          
   Detection Prevalence : 0.8148          
      Balanced Accuracy : 0.8975          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_uvvis$byClass)
x
Sensitivity 0.9767442
Specificity 0.8181818
Pos Pred Value 0.9545455
Neg Pred Value 0.9000000
Precision 0.9545455
Recall 0.9767442
F1 0.9655172
Prevalence 0.7962963
Detection Rate 0.7777778
Detection Prevalence 0.8148148
Balanced Accuracy 0.8974630
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_uvvis_rf_table <- cfmatrix_uvvis$table
TP <- cfmatrix_uvvis_rf_table[1,1] 
TN <- cfmatrix_uvvis_rf_table[2,2] 
FP <- cfmatrix_uvvis_rf_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_uvvis_rf_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.82"
Assess GC-MS RF Model on Test Data
#Predict gc test set
test_gc_rf<-predict(fit_rf_gc,newdata=data_test_gc_cf)

#get the confusion matrix
cfmatrix_gc<-confusionMatrix(test_gc_rf,data_test_gc_cf$class_1)

#print the confusion matrix
print(cfmatrix_gc)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          62         2
  Pure_EVOO             2        10
                                          
               Accuracy : 0.9474          
                 95% CI : (0.8707, 0.9855)
    No Information Rate : 0.8421          
    P-Value [Acc > NIR] : 0.004605        
                                          
                  Kappa : 0.8021          
                                          
 Mcnemar's Test P-Value : 1.000000        
                                          
            Sensitivity : 0.9688          
            Specificity : 0.8333          
         Pos Pred Value : 0.9688          
         Neg Pred Value : 0.8333          
             Prevalence : 0.8421          
         Detection Rate : 0.8158          
   Detection Prevalence : 0.8421          
      Balanced Accuracy : 0.9010          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_gc$byClass)
x
Sensitivity 0.9687500
Specificity 0.8333333
Pos Pred Value 0.9687500
Neg Pred Value 0.8333333
Precision 0.9687500
Recall 0.9687500
F1 0.9687500
Prevalence 0.8421053
Detection Rate 0.8157895
Detection Prevalence 0.8421053
Balanced Accuracy 0.9010417
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_gc_rf_table <- cfmatrix_gc$table
TP <- cfmatrix_gc_rf_table[1,1] 
TN <- cfmatrix_gc_rf_table[2,2] 
FP <- cfmatrix_gc_rf_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_gc_rf_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.8"
Plot Confusion Matrix Tables for RF Binary Classification Algorithm
# Plotting the confusion matrix

#HSI RF confusion Matrix
cf_hsi_rf<-ggplot(data = as.data.frame(cfmatrix_hsi$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "white", high = "#99ccff", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix HSI RF')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#Raman RF confusion Matrix
cf_raman_rf<-ggplot(data = as.data.frame(cfmatrix_raman$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkorange3", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix Raman RF')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#FTIR RF confusion Matrix
cf_ftir_rf<-ggplot(data = as.data.frame(cfmatrix_ftir$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkseagreen2", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix FTIR RF')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#UV-Vis RF confusion Matrix
cf_uvvis_rf<-ggplot(data = as.data.frame(cfmatrix_uvvis$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "turquoise", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix UV-Vis RF')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))
library(gridExtra) #grid.arrange() is provided by gridExtra, not grid
grid.arrange(cf_hsi_rf,cf_raman_rf,cf_ftir_rf,cf_uvvis_rf,nrow = 2)

#GC-MS RF confusion Matrix
ggplot(data = as.data.frame(cfmatrix_gc$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "tan1", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix GC-MS RF')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))
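The five near-identical heatmap chunks above differ only in the confusion matrix, title, and fill colours, so the duplication could be removed with a small helper (a sketch; `plot_cm` and its arguments are names introduced here):

```r
library(ggplot2)

# Reusable confusion-matrix heatmap; the table, title and fill colours
# are the only things that vary between models.
plot_cm <- function(cm_table, title, low = "white", high = "#99ccff") {
  ggplot(as.data.frame(cm_table), aes(x = Reference, y = Prediction)) +
    geom_tile(aes(fill = Freq), color = "black") +
    geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1, color = "black", size = 4) +
    scale_fill_gradient(low = low, high = high, guide = "none") +
    theme_minimal() +
    labs(x = "Actual Class", y = "Predicted Class", title = title) +
    theme(axis.text.y = element_text(color = "black", size = 8),
          axis.title.x = element_text(size = 8),
          axis.title.y = element_text(size = 8),
          aspect.ratio = 1,
          plot.title = element_text(size = 8, hjust = 0.5))
}

# e.g. plot_cm(cfmatrix_hsi$table, "Confusion Matrix HSI RF")
```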

Model 3. Linear Discriminant Analysis (LDA)

  • Linear Discriminant Analysis (LDA) is a supervised machine learning algorithm used for classification and dimensionality reduction. It finds the linear combinations of features that best separate two or more classes by maximizing the between-class separation while minimizing the within-class variation.
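The idea can be illustrated on R's built-in iris data before applying it to the spectral sets (a sketch independent of this project's data; `MASS::lda` is the backend caret uses for `method = "lda"`):

```r
library(MASS)

# Fit LDA on iris: find linear combinations of the four measurements
# that best separate the three species.
fit <- lda(Species ~ ., data = iris)
pred <- predict(fit, iris)

# Resubstitution accuracy (~0.98 on iris)
mean(pred$class == iris$Species)
```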
#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

#start_time
start_time<-Sys.time()

#Train LDA models for all the techniques

#HSI LDA model
fit_lda_hsi<-train(y=data_train_hsi_cf[,1],x=data_train_hsi_cf[,-1],
                   method = "lda",trControl = control,metric = metric)

#Raman LDA model
fit_lda_raman<-train(y=data_train_raman_cf[,1],x=data_train_raman_cf[,-1],
                   method = "lda",trControl = control,metric = metric)

#FTIR LDA model
fit_lda_ftir<-train(y=data_train_ftir_cf[,1],x=data_train_ftir_cf[,-1],
                   method = "lda",trControl = control,metric = metric)

#UV-Vis LDA model
fit_lda_uvvis<-train(y=data_train_uvvis_cf[,1],x=data_train_uvvis_cf[,-1],
                   method = "lda",trControl = control,metric = metric)
#GC-MS LDA model
fit_lda_gc<-train(y=data_train_gc_cf[,1],x=data_train_gc_cf[,-1],
                   method = "lda",trControl = control,metric = metric)
#End_time
end_time<-Sys.time()
model_training_time<-end_time-start_time
print(model_training_time)
Time difference of 15.52287 secs
stopCluster(cl)#stop the parallel run cluster
Display LDA cross-validation results
#HSI LDA CV results
#Output HSI LDA table
knitr::kable(fit_lda_hsi$results)
parameter logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
none 0.0284638 1 0.7546667 0.9912729 0.9757612 0.9942523 0.9891818 1 1 0.9685 1 0.9891818 0.7749359 0.9945909 0.1031222 0 0.0463641 0.0272712 0.0744303 0.0182485 0.0340566 0 0 0.0961345 0 0.0340566 0.0312723 0.0170283
#Raman LDA CV results
#Output Raman LDA table
knitr::kable(fit_lda_raman$results)
parameter logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
none 0.1913861 0.9892273 0.7501027 0.9154853 0.7750724 0.9429123 0.9180909 0.9116667 0.9756869 0.7985 0.9756869 0.9180909 0.7191758 0.9148788 0.1103101 0.0268968 0.0490102 0.0768107 0.2009449 0.0533472 0.0882201 0.1841884 0.0500591 0.1977439 0.0500591 0.0882201 0.0705581 0.0972957
#FTIR LDA CV results
#Output FTIR LDA table
knitr::kable(fit_lda_ftir$results)
parameter logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
none 0.2047317 0.9965303 0.7547326 0.9661172 0.9166844 0.9764475 0.9567273 1 1 0.8975 1 0.9567273 0.7497436 0.9783636 0.3734404 0.0133976 0.0553712 0.0561455 0.1304959 0.0401466 0.0709272 0 0 0.1532728 0 0.0709272 0.0617416 0.0354636
#UV-Vis LDA CV results
#Output UV-Vis table
knitr::kable(fit_lda_uvvis$results)
parameter logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
none 0.131119 1 0.758 0.9777198 0.924386 0.9863994 1 0.8983333 0.9740909 1 0.9740909 1 0.783663 0.9491667 0.2664903 0 0.053206 0.0367372 0.1273637 0.0223567 0 0.1689988 0.0424633 0 0.0424633 0 0.0257429 0.0844994
#GC-MS LDA CV results
#Output GC-MS LDA table
knitr::kable(fit_lda_gc$results)
parameter logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
none 0.5150056 0.8976528 0.6816489 0.8084211 0.5152486 0.869651 0.793625 0.8833333 0.9726396 0.4986548 0.9726396 0.793625 0.6627193 0.8384792 0.2335252 0.087154 0.0741628 0.101874 0.2133209 0.0759384 0.1131543 0.1857735 0.0432255 0.1815929 0.0432255 0.1131543 0.0944504 0.1127338

Test LDA Models

Hyperspectral Imaging LDA Test Results
#Predict HSI test set
test_hsi_lda<-predict(fit_lda_hsi,newdata=data_test_hsi_cf)

#get the confusion matrix
cfmatrix_hsi<-confusionMatrix(test_hsi_lda,data_test_hsi_cf$class_1)

#print the confusion matrix
print(cfmatrix_hsi)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         0
  Pure_EVOO             0        11
                                     
               Accuracy : 1          
                 95% CI : (0.934, 1) 
    No Information Rate : 0.7963     
    P-Value [Acc > NIR] : 4.55e-06   
                                     
                  Kappa : 1          
                                     
 Mcnemar's Test P-Value : NA         
                                     
            Sensitivity : 1.0000     
            Specificity : 1.0000     
         Pos Pred Value : 1.0000     
         Neg Pred Value : 1.0000     
             Prevalence : 0.7963     
         Detection Rate : 0.7963     
   Detection Prevalence : 0.7963     
      Balanced Accuracy : 1.0000     
                                     
       'Positive' Class : Adulterated
                                     
knitr::kable(cfmatrix_hsi$byClass)
x
Sensitivity 1.0000000
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 1.0000000
Precision 1.0000000
Recall 1.0000000
F1 1.0000000
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.7962963
Balanced Accuracy 1.0000000
#View the results as knitr table
knitr::kable(cfmatrix_hsi$table)
Adulterated Pure_EVOO
Adulterated 43 0
Pure_EVOO 0 11
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_hsi_lda_table <- cfmatrix_hsi$table
TP <- cfmatrix_hsi_lda_table[1,1] 
TN <- cfmatrix_hsi_lda_table[2,2] 
FP <- cfmatrix_hsi_lda_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_hsi_lda_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 1"
Test LDA Raman Model on Test Data
#Predict Raman test set
test_raman_lda<-predict(fit_lda_raman,newdata=data_test_raman_cf)

#get the confusion matrix
cfmatrix_raman<-confusionMatrix(test_raman_lda,data_test_raman_cf$class_1)

#print the confusion matrix
print(cfmatrix_raman)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          33         4
  Pure_EVOO            10         7
                                          
               Accuracy : 0.7407          
                 95% CI : (0.6035, 0.8504)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.8796          
                                          
                  Kappa : 0.3357          
                                          
 Mcnemar's Test P-Value : 0.1814          
                                          
            Sensitivity : 0.7674          
            Specificity : 0.6364          
         Pos Pred Value : 0.8919          
         Neg Pred Value : 0.4118          
             Prevalence : 0.7963          
         Detection Rate : 0.6111          
   Detection Prevalence : 0.6852          
      Balanced Accuracy : 0.7019          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_raman$byClass)
x
Sensitivity 0.7674419
Specificity 0.6363636
Pos Pred Value 0.8918919
Neg Pred Value 0.4117647
Precision 0.8918919
Recall 0.7674419
F1 0.8250000
Prevalence 0.7962963
Detection Rate 0.6111111
Detection Prevalence 0.6851852
Balanced Accuracy 0.7019027
#View the results as knitr table
knitr::kable(cfmatrix_raman$table)
Adulterated Pure_EVOO
Adulterated 33 4
Pure_EVOO 10 7
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_raman_lda_table <- cfmatrix_raman$table
TP <- cfmatrix_raman_lda_table[1,1] 
TN <- cfmatrix_raman_lda_table[2,2] 
FP <- cfmatrix_raman_lda_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_raman_lda_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.35"
FTIR Spectroscopy LDA Test Results
#Predict FTIR test set
test_ftir_lda<-predict(fit_lda_ftir,newdata=data_test_ftir_cf)

#get the confusion matrix
cfmatrix_ftir<-confusionMatrix(test_ftir_lda,data_test_ftir_cf$class_1)

#print the confusion matrix
print(cfmatrix_ftir)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          40         0
  Pure_EVOO             3        11
                                          
               Accuracy : 0.9444          
                 95% CI : (0.8461, 0.9884)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.002383        
                                          
                  Kappa : 0.8445          
                                          
 Mcnemar's Test P-Value : 0.248213        
                                          
            Sensitivity : 0.9302          
            Specificity : 1.0000          
         Pos Pred Value : 1.0000          
         Neg Pred Value : 0.7857          
             Prevalence : 0.7963          
         Detection Rate : 0.7407          
   Detection Prevalence : 0.7407          
      Balanced Accuracy : 0.9651          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_ftir$byClass)
x
Sensitivity 0.9302326
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 0.7857143
Precision 1.0000000
Recall 0.9302326
F1 0.9638554
Prevalence 0.7962963
Detection Rate 0.7407407
Detection Prevalence 0.7407407
Balanced Accuracy 0.9651163
#View the results as knitr table
knitr::kable(cfmatrix_ftir$table)
Adulterated Pure_EVOO
Adulterated 40 0
Pure_EVOO 3 11
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_ftir_lda_table <- cfmatrix_ftir$table
TP <- cfmatrix_ftir_lda_table[1,1] 
TN <- cfmatrix_ftir_lda_table[2,2] 
FP <- cfmatrix_ftir_lda_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_ftir_lda_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.85"
Assess UV-Vis LDA Model on Test Data
#Predict uvvis test set
test_uvvis_lda<-predict(fit_lda_uvvis,newdata=data_test_uvvis_cf)

#get the confusion matrix
cfmatrix_uvvis<-confusionMatrix(test_uvvis_lda,data_test_uvvis_cf$class_1)

#print the confusion matrix
print(cfmatrix_uvvis)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         4
  Pure_EVOO             0         7
                                          
               Accuracy : 0.9259          
                 95% CI : (0.8211, 0.9794)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.008546        
                                          
                  Kappa : 0.7359          
                                          
 Mcnemar's Test P-Value : 0.133614        
                                          
            Sensitivity : 1.0000          
            Specificity : 0.6364          
         Pos Pred Value : 0.9149          
         Neg Pred Value : 1.0000          
             Prevalence : 0.7963          
         Detection Rate : 0.7963          
   Detection Prevalence : 0.8704          
      Balanced Accuracy : 0.8182          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_uvvis$byClass)
x
Sensitivity 1.0000000
Specificity 0.6363636
Pos Pred Value 0.9148936
Neg Pred Value 1.0000000
Precision 0.9148936
Recall 1.0000000
F1 0.9555556
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.8703704
Balanced Accuracy 0.8181818
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_uvvis_lda_table <- cfmatrix_uvvis$table
TP <- cfmatrix_uvvis_lda_table[1,1] 
TN <- cfmatrix_uvvis_lda_table[2,2] 
FP <- cfmatrix_uvvis_lda_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_uvvis_lda_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.76"
Assess GC-MS LDA Model on Test Data
#Predict gc test set
test_gc_lda<-predict(fit_lda_gc,newdata=data_test_gc_cf)

#get the confusion matrix
cfmatrix_gc<-confusionMatrix(test_gc_lda,data_test_gc_cf$class_1)

#print the confusion matrix
print(cfmatrix_gc)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          49         1
  Pure_EVOO            15        11
                                          
               Accuracy : 0.7895          
                 95% CI : (0.6808, 0.8746)
    No Information Rate : 0.8421          
    P-Value [Acc > NIR] : 0.917289        
                                          
                  Kappa : 0.4629          
                                          
 Mcnemar's Test P-Value : 0.001154        
                                          
            Sensitivity : 0.7656          
            Specificity : 0.9167          
         Pos Pred Value : 0.9800          
         Neg Pred Value : 0.4231          
             Prevalence : 0.8421          
         Detection Rate : 0.6447          
   Detection Prevalence : 0.6579          
      Balanced Accuracy : 0.8411          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_gc$byClass)
x
Sensitivity 0.7656250
Specificity 0.9166667
Pos Pred Value 0.9800000
Neg Pred Value 0.4230769
Precision 0.9800000
Recall 0.7656250
F1 0.8596491
Prevalence 0.8421053
Detection Rate 0.6447368
Detection Prevalence 0.6578947
Balanced Accuracy 0.8411458
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
# (rows = predicted class, columns = reference class)
cfmatrix_gc_lda_table <- cfmatrix_gc$table
TP <- cfmatrix_gc_lda_table[1,1] 
TN <- cfmatrix_gc_lda_table[2,2] 
FP <- cfmatrix_gc_lda_table[1,2] #predicted Adulterated, truly Pure_EVOO
FN <- cfmatrix_gc_lda_table[2,1] #predicted Pure_EVOO, truly Adulterated

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.52"
Plot Confusion Matrix Tables for LDA Binary Classification Algorithm
# Plotting the confusion matrix

#HSI LDA confusion Matrix
cf_hsi_lda<-ggplot(data = as.data.frame(cfmatrix_hsi$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "white", high = "#99ccff", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix HSI LDA')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#Raman LDA confusion Matrix
cf_raman_lda<-ggplot(data = as.data.frame(cfmatrix_raman$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkorange3", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix Raman LDA')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#FTIR LDA confusion Matrix
cf_ftir_lda<-ggplot(data = as.data.frame(cfmatrix_ftir$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkseagreen2", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix FTIR LDA')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#UV-Vis LDA confusion Matrix
cf_uvvis_lda<-ggplot(data = as.data.frame(cfmatrix_uvvis$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "turquoise", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix UV-Vis LDA')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))
library(gridExtra)#grid.arrange() comes from gridExtra, not grid
grid.arrange(cf_hsi_lda,cf_raman_lda,cf_ftir_lda,cf_uvvis_lda,nrow = 2)

#GC-MS LDA confusion Matrix
ggplot(data = as.data.frame(cfmatrix_gc$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "tan1", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", title = 'Confusion Matrix GC-MS LDA')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

Model 4. Support Vector Machines (SVM)

  • A Support Vector Machine (SVM) with a Radial Basis Function (RBF) kernel is a supervised machine learning algorithm used for classification and regression tasks. The RBF kernel implicitly maps the input data into a higher-dimensional feature space, introducing non-linear decision boundaries that allow the SVM to separate classes that are not linearly separable in the original feature space.
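In the kernlab parameterization used by caret's svmRadial method (a sketch based on kernlab's rbfdot kernel, not stated explicitly in this document), the RBF kernel between two spectra $\mathbf{x}$ and $\mathbf{x}'$ is

$$K(\mathbf{x}, \mathbf{x}') = \exp\!\left(-\sigma \,\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}\right)$$

where $\sigma$ controls the kernel width (larger $\sigma$ gives more localized, flexible boundaries) and the cost parameter $C$ penalizes misclassified training points. This is why each cross-validation table below shows a single sigma value paired with a grid of $C$ values.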
#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

#start_time
start_time<-Sys.time()

#Set tuneLength to search candidate values of the cost parameter C on cross-validation (caret's svmRadial holds sigma fixed at a value estimated via kernlab::sigest)

#Train SVM models for all the techniques

#HSI SVM model

fit_svm_hsi<-train(class_1~.,data = data_train_hsi_cf,
                   method = "svmRadial",tuneLength = 10,trControl = control,metric = metric)

#Raman SVM model
fit_svm_raman<-train(class_1~.,data = data_train_raman_cf,
                   method = "svmRadial",tuneLength = 10,trControl = control,metric = metric)
Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,
: There were missing values in resampled performance measures.
#FTIR SVM model
fit_svm_ftir<-train(class_1~.,data = data_train_ftir_cf,
                   method = "svmRadial",tuneLength = 10,trControl = control,metric = metric)

#UV-Vis SVM model
fit_svm_uvvis<-train(class_1~.,data = data_train_uvvis_cf,
                   method = "svmRadial",tuneLength = 10,trControl = control,metric = metric)
#GC-MS SVM model
fit_svm_gc<-train(class_1~.,data = data_train_gc_cf,
                 method = "svmRadial",tuneLength = 10,trControl = control,metric = metric)
Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,
: There were missing values in resampled performance measures.
#End_time
end_time<-Sys.time()
model_training_time<-end_time-start_time
print(model_training_time)
Time difference of 1.141478 mins
stopCluster(cl)#stop the parallel run cluster
Display SVM cross-validation results
#HSI SVM CV results
print(paste('The optimal parameters for training the HSI-SVM model are sigma value of',round(as.numeric(fit_svm_hsi$bestTune$sigma),2),'and Cost parameter of',fit_svm_hsi$bestTune$C))
[1] "The optimal parameters for training the HSI-SVM model are sigma value of 0.44 and Cost parameter of 0.25"
#Output HSI SVM table
knitr::kable(fit_svm_hsi$results)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.4374344 0.25 0.0145542 1 0.7666212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0069286 0 0.0335438 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 0.50 0.0135823 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0055804 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 1.00 0.0132856 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0051627 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 2.00 0.0131436 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0051379 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 4.00 0.0130916 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0051313 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 8.00 0.0130363 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0051126 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 16.00 0.0131602 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0051175 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 32.00 0.0131417 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0051195 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 64.00 0.0131177 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0051598 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
0.4374344 128.00 0.0130504 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0051417 0 0.0334150 0 0 0 0 0 0 0 0 0 0.0263793 0
#The optimal selected model
selected_model<-fit_svm_hsi$results %>% filter(C==as.numeric(fit_svm_hsi$bestTune$C))
knitr::kable(selected_model)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.4374344 0.25 0.0145542 1 0.7666212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0069286 0 0.0335438 0 0 0 0 0 0 0 0 0 0.0263793 0
#Raman SVM CV results
print(paste('The optimal parameters for training the raman-SVM model are sigma value of',round(as.numeric(fit_svm_raman$bestTune$sigma),2),'and Cost parameter of',fit_svm_raman$bestTune$C))
[1] "The optimal parameters for training the raman-SVM model are sigma value of 0.49 and Cost parameter of 1"
#Output Raman SVM  table
knitr::kable(fit_svm_raman$results)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.4929892 0.25 0.2274672 0.9609242 0.7196617 0.8851374 0.6555490 0.9259118 0.9251818 0.7466667 0.9333895 0.7750000 0.9333895 0.9251818 0.7246062 0.8359242 0.1398819 0.0649300 0.0710213 0.0961442 0.2927728 0.0628968 0.0889491 0.2905933 0.0749738 0.2573796 0.0749738 0.0889491 0.0696821 0.1526554
0.4929892 0.50 0.1828080 0.9752727 0.7330134 0.9244505 0.7619631 0.9524320 0.9641818 0.7883333 0.9447315 0.8804714 0.9447315 0.9641818 0.7554396 0.8762576 0.1403947 0.0476020 0.0629070 0.0746727 0.2440205 0.0471058 0.0590124 0.2538757 0.0645822 0.1983498 0.0645822 0.0590124 0.0505594 0.1312179
0.4929892 1.00 0.1606175 0.9823333 0.7433337 0.9284158 0.7682105 0.9551819 0.9681818 0.7900000 0.9466865 0.8982993 0.9466865 0.9681818 0.7586447 0.8790909 0.1205226 0.0366275 0.0536193 0.0713983 0.2526799 0.0441219 0.0564335 0.2697008 0.0657914 0.1940774 0.0657914 0.0564335 0.0496724 0.1357847
0.4929892 2.00 0.1595678 0.9839848 0.7446320 0.9260531 0.7628853 0.9536365 0.9651818 0.7900000 0.9461208 0.8887755 0.9461208 0.9651818 0.7563370 0.8775909 0.1227819 0.0340677 0.0526921 0.0732824 0.2479212 0.0456799 0.0555322 0.2644478 0.0661711 0.1827431 0.0661711 0.0555322 0.0495140 0.1347795
0.4929892 4.00 0.1576467 0.9839242 0.7433892 0.9261172 0.7621202 0.9536008 0.9641818 0.7916667 0.9466562 0.8824742 0.9466562 0.9641818 0.7556319 0.8779242 0.1296206 0.0372488 0.0546428 0.0724567 0.2534667 0.0450347 0.0539580 0.2631236 0.0643213 0.1824374 0.0643213 0.0539580 0.0497045 0.1348823
0.4929892 8.00 0.1601028 0.9807273 0.7410087 0.9281044 0.7680857 0.9548282 0.9631818 0.8016667 0.9503632 0.8811224 0.9503632 0.9631818 0.7547344 0.8824242 0.1382108 0.0351721 0.0524995 0.0672392 0.2387877 0.0418011 0.0542135 0.2591697 0.0619971 0.1831636 0.0619971 0.0542135 0.0480599 0.1301467
0.4929892 16.00 0.1828568 0.9739545 0.7350754 0.9166209 0.7308781 0.9479098 0.9631818 0.7533333 0.9367929 0.8738832 0.9367929 0.9631818 0.7546703 0.8582576 0.1662031 0.0644545 0.0678747 0.0767061 0.2618727 0.0476194 0.0578199 0.2674651 0.0665730 0.2007063 0.0665730 0.0578199 0.0497008 0.1383955
0.4929892 32.00 0.1745291 0.9772879 0.7374678 0.9266300 0.7612260 0.9542957 0.9700909 0.7716667 0.9421911 0.9000000 0.9421911 0.9700909 0.7601282 0.8708788 0.1520347 0.0479996 0.0619125 0.0668671 0.2266585 0.0414539 0.0501321 0.2480621 0.0617538 0.1774728 0.0617538 0.0501321 0.0448243 0.1261530
0.4929892 64.00 0.1854553 0.9758333 0.7415406 0.9352930 0.7833797 0.9598988 0.9750909 0.7933333 0.9483601 0.9183849 0.9483601 0.9750909 0.7641026 0.8842121 0.1767205 0.0631353 0.0631831 0.0732380 0.2584463 0.0453826 0.0498704 0.2733177 0.0667477 0.1670912 0.0667477 0.0498704 0.0456302 0.1404556
0.4929892 128.00 0.1961176 0.9688788 0.7345638 0.9342766 0.7821718 0.9591672 0.9760909 0.7833333 0.9457366 0.9226190 0.9457366 0.9760909 0.7648077 0.8797121 0.2066993 0.0690990 0.0671872 0.0700631 0.2407053 0.0438516 0.0513554 0.2534830 0.0619413 0.1684824 0.0619413 0.0513554 0.0455181 0.1312285
#The optimal selected model
selected_model<-fit_svm_raman$results %>% filter(C==as.numeric(fit_svm_raman$bestTune$C))
knitr::kable(selected_model)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.4929892 1 0.1606175 0.9823333 0.7433337 0.9284158 0.7682105 0.9551819 0.9681818 0.79 0.9466865 0.8982993 0.9466865 0.9681818 0.7586447 0.8790909 0.1205226 0.0366275 0.0536193 0.0713983 0.2526799 0.0441219 0.0564335 0.2697008 0.0657914 0.1940774 0.0657914 0.0564335 0.0496724 0.1357847
#FTIR SVM CV results
print(paste('The optimal parameters for training the ftir-SVM model are sigma value of',round(as.numeric(fit_svm_ftir$bestTune$sigma),2),'and Cost parameter of',fit_svm_ftir$bestTune$C))
[1] "The optimal parameters for training the ftir-SVM model are sigma value of 0.74 and Cost parameter of 0.25"
#Output FTIR SVM  table
knitr::kable(fit_svm_ftir$results)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.7361856 0.25 0.1068539 0.9927424 0.7496635 0.9630861 0.8931236 0.9758508 0.9715455 0.9316667 0.9825303 0.9225000 0.9825303 0.9715455 0.7614377 0.9516061 0.1456756 0.0207046 0.0530076 0.0539722 0.1576123 0.0356911 0.0546014 0.1518590 0.0385234 0.1440238 0.0385234 0.0546014 0.0505374 0.0812399
0.7361856 0.50 0.1033717 0.9920758 0.7498625 0.9630311 0.8913872 0.9759392 0.9724545 0.9283333 0.9818636 0.9248333 0.9818636 0.9724545 0.7621520 0.9503939 0.1431754 0.0220408 0.0530074 0.0530478 0.1592419 0.0347923 0.0527475 0.1611111 0.0404926 0.1400957 0.0404926 0.0527475 0.0493349 0.0842064
0.7361856 1.00 0.1035984 0.9937121 0.7510435 0.9646886 0.8972701 0.9768941 0.9725455 0.9350000 0.9836818 0.9266667 0.9836818 0.9725455 0.7622711 0.9537727 0.1427482 0.0179008 0.0518651 0.0538517 0.1599479 0.0355009 0.0541920 0.1569472 0.0391914 0.1405856 0.0391914 0.0541920 0.0510315 0.0830300
0.7361856 2.00 0.1034399 0.9945455 0.7545334 0.9600092 0.8825657 0.9739893 0.9715455 0.9183333 0.9790455 0.9225000 0.9790455 0.9715455 0.7614377 0.9449394 0.1385220 0.0164742 0.0467754 0.0550690 0.1639875 0.0361778 0.0546014 0.1666582 0.0423833 0.1440238 0.0423833 0.0546014 0.0505374 0.0871020
0.7361856 4.00 0.1073725 0.9938788 0.7529963 0.9614377 0.8859804 0.9749893 0.9733636 0.9183333 0.9790455 0.9265000 0.9790455 0.9733636 0.7628663 0.9458485 0.1391401 0.0181919 0.0488003 0.0542200 0.1628264 0.0354874 0.0524268 0.1666582 0.0423833 0.1404877 0.0423833 0.0524268 0.0491517 0.0871984
0.7361856 8.00 0.1203417 0.9921515 0.7511326 0.9629762 0.8892346 0.9759924 0.9762727 0.9133333 0.9782121 0.9383333 0.9782121 0.9762727 0.7652381 0.9448030 0.1499170 0.0222726 0.0509742 0.0540487 0.1611875 0.0355919 0.0527069 0.1716102 0.0427844 0.1320115 0.0427844 0.0527069 0.0508089 0.0886578
0.7361856 16.00 0.1292030 0.9849697 0.7458000 0.9668315 0.9004737 0.9785797 0.9791818 0.9216667 0.9800303 0.9438333 0.9800303 0.9791818 0.7674908 0.9504242 0.1637244 0.0443712 0.0541867 0.0516818 0.1558122 0.0337178 0.0474809 0.1648980 0.0417162 0.1244213 0.0417162 0.0474809 0.0471666 0.0859045
0.7361856 32.00 0.1299518 0.9793939 0.7416095 0.9660623 0.8992696 0.9779473 0.9781818 0.9216667 0.9800303 0.9433333 0.9800303 0.9781818 0.7667216 0.9499242 0.1693758 0.0586982 0.0607876 0.0538632 0.1592332 0.0355112 0.0517818 0.1648980 0.0417162 0.1294684 0.0417162 0.0517818 0.0500376 0.0862946
0.7361856 64.00 0.1298283 0.9805455 0.7433718 0.9660073 0.8988521 0.9779448 0.9780909 0.9216667 0.9800303 0.9423333 0.9800303 0.9780909 0.7666667 0.9498788 0.1654815 0.0568522 0.0574934 0.0529400 0.1575885 0.0347966 0.0502784 0.1648980 0.0417162 0.1276442 0.0417162 0.0502784 0.0493197 0.0860511
0.7361856 128.00 0.1284680 0.9791515 0.7425314 0.9660623 0.8992696 0.9779473 0.9781818 0.9216667 0.9800303 0.9433333 0.9800303 0.9781818 0.7667216 0.9499242 0.1677677 0.0705617 0.0652824 0.0538632 0.1592332 0.0355112 0.0517818 0.1648980 0.0417162 0.1294684 0.0417162 0.0517818 0.0500376 0.0862946
#The optimal selected model
selected_model<-fit_svm_ftir$results %>% filter(C==as.numeric(fit_svm_ftir$bestTune$C))
knitr::kable(selected_model)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.7361856 0.25 0.1068539 0.9927424 0.7496635 0.9630861 0.8931236 0.9758508 0.9715455 0.9316667 0.9825303 0.9225 0.9825303 0.9715455 0.7614377 0.9516061 0.1456756 0.0207046 0.0530076 0.0539722 0.1576123 0.0356911 0.0546014 0.151859 0.0385234 0.1440238 0.0385234 0.0546014 0.0505374 0.0812399
#UV-Vis SVM CV results
print(paste('The optimal parameters for training the uvvis-SVM model are sigma value of',round(as.numeric(fit_svm_uvvis$bestTune$sigma),2),'and Cost parameter of',fit_svm_uvvis$bestTune$C))
[1] "The optimal parameters for training the uvvis-SVM model are sigma value of 0.52 and Cost parameter of 0.5"
#Output UV-Vis SVM  table
knitr::kable(fit_svm_uvvis$results)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.5193838 0.25 0.0360344 1 0.7671212 0.9899908 0.9668551 0.9938509 1 0.9550000 0.9882576 1 0.9882576 1 0.7835897 0.9775000 0.0325018 0 0.033415 0.0260374 0.0870871 0.0159925 0 0.1181009 0.0305389 0 0.0305389 0 0.0263793 0.0590504
0.5193838 0.50 0.0333284 1 0.7671212 0.9922344 0.9729164 0.9952795 1 0.9633333 0.9909848 1 0.9909848 1 0.7835897 0.9816667 0.0314328 0 0.033415 0.0234371 0.0833217 0.0142383 0 0.1125662 0.0271913 0 0.0271913 0 0.0263793 0.0562831
0.5193838 1.00 0.0316591 1 0.7671212 0.9899908 0.9668551 0.9938509 1 0.9550000 0.9882576 1 0.9882576 1 0.7835897 0.9775000 0.0317597 0 0.033415 0.0260374 0.0870871 0.0159925 0 0.1181009 0.0305389 0 0.0305389 0 0.0263793 0.0590504
0.5193838 2.00 0.0294402 1 0.7671212 0.9931319 0.9779636 0.9957557 1 0.9700000 0.9918939 1 0.9918939 1 0.7835897 0.9850000 0.0317476 0 0.033415 0.0219555 0.0704254 0.0135697 0 0.0958745 0.0259154 0 0.0259154 0 0.0263793 0.0479372
0.5193838 4.00 0.0275844 1 0.7671212 0.9915842 0.9717998 0.9948447 1 0.9616667 0.9901515 1 0.9901515 1 0.7835897 0.9808333 0.0318035 0 0.033415 0.0240797 0.0815839 0.0147474 0 0.1107443 0.0281715 0 0.0281715 0 0.0263793 0.0553722
0.5193838 8.00 0.0277735 1 0.7671212 0.9884432 0.9606913 0.9929400 1 0.9466667 0.9865152 1 0.9865152 1 0.7835897 0.9733333 0.0336431 0 0.033415 0.0276752 0.0956140 0.0168998 0 0.1294901 0.0322776 0 0.0322776 0 0.0263793 0.0647450
0.5193838 16.00 0.0279007 1 0.7671212 0.9900458 0.9668941 0.9938923 1 0.9550000 0.9883333 1 0.9883333 1 0.7835897 0.9775000 0.0330300 0 0.033415 0.0259003 0.0869918 0.0158890 0 0.1181009 0.0303493 0 0.0303493 0 0.0263793 0.0590504
0.5193838 32.00 0.0274810 1 0.7671212 0.9908791 0.9706441 0.9943685 1 0.9600000 0.9892424 1 0.9892424 1 0.7835897 0.9800000 0.0340679 0 0.033415 0.0248341 0.0798982 0.0153363 0 0.1088662 0.0292949 0 0.0292949 0 0.0263793 0.0544331
0.5193838 64.00 0.0275591 1 0.7671212 0.9899908 0.9668551 0.9938509 1 0.9550000 0.9882576 1 0.9882576 1 0.7835897 0.9775000 0.0328275 0 0.033415 0.0260374 0.0870871 0.0159925 0 0.1181009 0.0305389 0 0.0305389 0 0.0263793 0.0590504
0.5193838 128.00 0.0279922 1 0.7671212 0.9908150 0.9693469 0.9943685 1 0.9583333 0.9892424 1 0.9892424 1 0.7835897 0.9791667 0.0343062 0 0.033415 0.0250185 0.0843673 0.0153363 0 0.1145307 0.0292949 0 0.0292949 0 0.0263793 0.0572654
#The optimal selected model
selected_model<-fit_svm_uvvis$results %>% filter(C==as.numeric(fit_svm_uvvis$bestTune$C))
knitr::kable(selected_model)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.5193838 0.5 0.0333284 1 0.7671212 0.9922344 0.9729164 0.9952795 1 0.9633333 0.9909848 1 0.9909848 1 0.7835897 0.9816667 0.0314328 0 0.033415 0.0234371 0.0833217 0.0142383 0 0.1125662 0.0271913 0 0.0271913 0 0.0263793 0.0562831
#GC-MS SVM CV results
print(paste('The optimal parameters for training the gc-SVM model are sigma value of',round(as.numeric(fit_svm_gc$bestTune$sigma),2),'and Cost parameter of',fit_svm_gc$bestTune$C))
[1] "The optimal parameters for training the gc-SVM model are sigma value of 0.39 and Cost parameter of 1"
#Output GC SVM  table
knitr::kable(fit_svm_gc$results)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.3890734 0.25 0.1984399 0.9720417 0.7489366 0.9234211 0.7387420 0.9528940 0.9401250 0.8400000 0.9691937 0.7791667 0.9691937 0.9401250 0.7850877 0.8900625 0.1344904 0.0382078 0.0638848 0.0597833 0.2030897 0.0375241 0.0610900 0.2195803 0.0413794 0.1975918 0.0413794 0.0610900 0.0511715 0.1111631
0.3890734 0.50 0.1734756 0.9790417 0.7604047 0.9437719 0.8020220 0.9657785 0.9585000 0.8700000 0.9751331 0.8388333 0.9751331 0.9585000 0.8004386 0.9142500 0.1346872 0.0309371 0.0564803 0.0530237 0.1876449 0.0326710 0.0491579 0.2004204 0.0376937 0.1764479 0.0376937 0.0491579 0.0413333 0.1034032
0.3890734 1.00 0.1601835 0.9805556 0.7649016 0.9597953 0.8533810 0.9758005 0.9744167 0.8866667 0.9784706 0.8971667 0.9784706 0.9744167 0.8137135 0.9305417 0.1379717 0.0299078 0.0515986 0.0440245 0.1629329 0.0265886 0.0382433 0.1786594 0.0334851 0.1497858 0.0334851 0.0382433 0.0318800 0.0917839
0.3890734 2.00 0.1439365 0.9821111 0.7680154 0.9608187 0.8545725 0.9764269 0.9736250 0.8966667 0.9807157 0.8941077 0.9807157 0.9736250 0.8130702 0.9351458 0.1277718 0.0289970 0.0479715 0.0432978 0.1760949 0.0260086 0.0376053 0.1935840 0.0351687 0.1419078 0.0351687 0.0376053 0.0317741 0.0972681
0.3890734 4.00 0.1485790 0.9792778 0.7654093 0.9618713 0.8546747 0.9772870 0.9782083 0.8800000 0.9776127 0.9083333 0.9776127 0.9782083 0.8169006 0.9291042 0.1331861 0.0349321 0.0506155 0.0406391 0.1659752 0.0239520 0.0312190 0.1983094 0.0364846 0.1347879 0.0364846 0.0312190 0.0266266 0.0993475
0.3890734 8.00 0.1594648 0.9742500 0.7588746 0.9487135 0.8092786 0.9692424 0.9689583 0.8466667 0.9711169 0.8721667 0.9711169 0.9689583 0.8091813 0.9078125 0.1370649 0.0415489 0.0540039 0.0479098 0.1835749 0.0287925 0.0403294 0.2087857 0.0386486 0.1621450 0.0386486 0.0403294 0.0342296 0.1060021
0.3890734 16.00 0.1713354 0.9698611 0.7566058 0.9410819 0.7735999 0.9649034 0.9676667 0.8066667 0.9641501 0.8598333 0.9641501 0.9676667 0.8080994 0.8871667 0.1364650 0.0470989 0.0538172 0.0485667 0.2017312 0.0287359 0.0403501 0.2377410 0.0433197 0.1850637 0.0433197 0.0403501 0.0341691 0.1180696
0.3890734 32.00 0.1998490 0.9552361 0.7393998 0.9295906 0.7332233 0.9578587 0.9590833 0.7800000 0.9589455 0.8256667 0.9589455 0.9590833 0.8009357 0.8695417 0.1478541 0.0608609 0.0651100 0.0538075 0.2143873 0.0323817 0.0468433 0.2471839 0.0448458 0.1984426 0.0448458 0.0468433 0.0395880 0.1235660
0.3890734 64.00 0.2053319 0.9587222 0.7429566 0.9273684 0.7213881 0.9565170 0.9584167 0.7700000 0.9573272 0.8217687 0.9573272 0.9584167 0.8003801 0.8642083 0.1636987 0.0574658 0.0616988 0.0564916 0.2349347 0.0342355 0.0504260 0.2624883 0.0469374 0.2028916 0.0469374 0.0504260 0.0425572 0.1304670
0.3890734 128.00 0.2231250 0.9513056 0.7353020 0.9301170 0.7349705 0.9581873 0.9591250 0.7833333 0.9594697 0.8193603 0.9594697 0.9591250 0.8009649 0.8712292 0.1740380 0.0686545 0.0700855 0.0542389 0.2244704 0.0323520 0.0458532 0.2478867 0.0449145 0.2015207 0.0449145 0.0458532 0.0386555 0.1246005
#The optimal selected model
selected_model<-fit_svm_gc$results %>% filter(C==as.numeric(fit_svm_gc$bestTune$C))
knitr::kable(selected_model)
sigma C logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
0.3890734 1 0.1601835 0.9805556 0.7649016 0.9597953 0.853381 0.9758005 0.9744167 0.8866667 0.9784706 0.8971667 0.9784706 0.9744167 0.8137135 0.9305417 0.1379717 0.0299078 0.0515986 0.0440245 0.1629329 0.0265886 0.0382433 0.1786594 0.0334851 0.1497858 0.0334851 0.0382433 0.03188 0.0917839
Plot the SVM CV Models
#HSI CV Plot
p1<-ggplot(fit_svm_hsi)+geom_line(colour = "red")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='HSI SVM Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

 #Raman CV Plot
p2<-ggplot(fit_svm_raman)+geom_line(colour = "blue")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='Raman SVM Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#FTIR CV Plot
p3<-ggplot(fit_svm_ftir)+geom_line(colour = "black")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='FTIR SVM Model Training', y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#UV-Vis CV Plot
p4<-ggplot(fit_svm_uvvis)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='UV-Vis SVM Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

#Arrange the svm training model plots

grid.arrange(p1,p2,p3,p4,nrow = 2)

#GC-MS SVM CV Plot
ggplot(fit_svm_gc)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='GC-MS SVM Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

Test SVM Models

Hyperspectral Imaging SVM Test Results
#Predict HSI test set
test_hsi_svm<-predict(fit_svm_hsi,newdata=data_test_hsi_cf)

#get the confusion matrix
cfmatrix_hsi<-confusionMatrix(test_hsi_svm,data_test_hsi_cf$class_1)

#print the confusion matrix
print(cfmatrix_hsi)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         3
  Pure_EVOO             0         8
                                          
               Accuracy : 0.9444          
                 95% CI : (0.8461, 0.9884)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.002383        
                                          
                  Kappa : 0.8094          
                                          
 Mcnemar's Test P-Value : 0.248213        
                                          
            Sensitivity : 1.0000          
            Specificity : 0.7273          
         Pos Pred Value : 0.9348          
         Neg Pred Value : 1.0000          
             Prevalence : 0.7963          
         Detection Rate : 0.7963          
   Detection Prevalence : 0.8519          
      Balanced Accuracy : 0.8636          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_hsi$byClass)
x
Sensitivity 1.0000000
Specificity 0.7272727
Pos Pred Value 0.9347826
Neg Pred Value 1.0000000
Precision 0.9347826
Recall 1.0000000
F1 0.9662921
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.8518519
Balanced Accuracy 0.8636364
#View the results as knitr table
knitr::kable(cfmatrix_hsi$table)
Adulterated Pure_EVOO
Adulterated 43 3
Pure_EVOO 0 8
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_hsi_svm_table <- cfmatrix_hsi$table
TP <- cfmatrix_hsi_svm_table[1,1] 
TN <- cfmatrix_hsi_svm_table[2,2] 
FP <- cfmatrix_hsi_svm_table[2,1] 
FN <- cfmatrix_hsi_svm_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",MCC))
[1] "The MCC value for the model is 0.824525255667285"
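The code above implements the standard MCC definition from the four cells of the confusion matrix:

$$\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$$

MCC ranges from $-1$ (total disagreement) through $0$ (random prediction) to $+1$ (perfect prediction), and unlike raw accuracy it remains informative on imbalanced test sets such as this one (43 adulterated versus 11 pure samples).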
Test SVM Raman Model on Test Data
#Predict Raman test set
test_raman_svm<-predict(fit_svm_raman,newdata=data_test_raman_cf)

#get the confusion matrix
cfmatrix_raman<-confusionMatrix(test_raman_svm,data_test_raman_cf$class_1)

#print the confusion matrix
print(cfmatrix_raman)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          41         4
  Pure_EVOO             2         7
                                          
               Accuracy : 0.8889          
                 95% CI : (0.7737, 0.9581)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.05725         
                                          
                  Kappa : 0.6327          
                                          
 Mcnemar's Test P-Value : 0.68309         
                                          
            Sensitivity : 0.9535          
            Specificity : 0.6364          
         Pos Pred Value : 0.9111          
         Neg Pred Value : 0.7778          
             Prevalence : 0.7963          
         Detection Rate : 0.7593          
   Detection Prevalence : 0.8333          
      Balanced Accuracy : 0.7949          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_raman$byClass)
x
Sensitivity 0.9534884
Specificity 0.6363636
Pos Pred Value 0.9111111
Neg Pred Value 0.7777778
Precision 0.9111111
Recall 0.9534884
F1 0.9318182
Prevalence 0.7962963
Detection Rate 0.7592593
Detection Prevalence 0.8333333
Balanced Accuracy 0.7949260
#View the results as knitr table
knitr::kable(cfmatrix_raman$table)
Adulterated Pure_EVOO
Adulterated 41 4
Pure_EVOO 2 7
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_raman_svm_table <- cfmatrix_raman$table
TP <- cfmatrix_raman_svm_table[1,1] 
TN <- cfmatrix_raman_svm_table[2,2] 
FP <- cfmatrix_raman_svm_table[2,1] 
FN <- cfmatrix_raman_svm_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.64"
FTIR Spectroscopy SVM Test Results
#Predict FTIR test set
test_ftir_svm<-predict(fit_svm_ftir,newdata=data_test_ftir_cf)

#get the confusion matrix
cfmatrix_ftir<-confusionMatrix(test_ftir_svm,data_test_ftir_cf$class_1)

#print the confusion matrix
print(cfmatrix_ftir)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          42         0
  Pure_EVOO             1        11
                                          
               Accuracy : 0.9815          
                 95% CI : (0.9011, 0.9995)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 6.741e-05       
                                          
                  Kappa : 0.9448          
                                          
 Mcnemar's Test P-Value : 1               
                                          
            Sensitivity : 0.9767          
            Specificity : 1.0000          
         Pos Pred Value : 1.0000          
         Neg Pred Value : 0.9167          
             Prevalence : 0.7963          
         Detection Rate : 0.7778          
   Detection Prevalence : 0.7778          
      Balanced Accuracy : 0.9884          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_ftir$byClass)
x
Sensitivity 0.9767442
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 0.9166667
Precision 1.0000000
Recall 0.9767442
F1 0.9882353
Prevalence 0.7962963
Detection Rate 0.7777778
Detection Prevalence 0.7777778
Balanced Accuracy 0.9883721
#View the results as knitr table
knitr::kable(cfmatrix_ftir$table)
Adulterated Pure_EVOO
Adulterated 42 0
Pure_EVOO 1 11
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_ftir_svm_table <- cfmatrix_ftir$table
TP <- cfmatrix_ftir_svm_table[1,1] 
TN <- cfmatrix_ftir_svm_table[2,2] 
FP <- cfmatrix_ftir_svm_table[2,1] 
FN <- cfmatrix_ftir_svm_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",MCC))
[1] "The MCC value for the model is 0.946228744653904"
Assess UV-Vis SVM Model on Test Data
#Predict uvvis test set
test_uvvis_svm<-predict(fit_svm_uvvis,newdata=data_test_uvvis_cf)

#get the confusion matrix
cfmatrix_uvvis<-confusionMatrix(test_uvvis_svm,data_test_uvvis_cf$class_1)

#print the confusion matrix
print(cfmatrix_uvvis)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         5
  Pure_EVOO             0         6
                                         
               Accuracy : 0.9074         
                 95% CI : (0.797, 0.9692)
    No Information Rate : 0.7963         
    P-Value [Acc > NIR] : 0.02431        
                                         
                  Kappa : 0.6565         
                                         
 Mcnemar's Test P-Value : 0.07364        
                                         
            Sensitivity : 1.0000         
            Specificity : 0.5455         
         Pos Pred Value : 0.8958         
         Neg Pred Value : 1.0000         
             Prevalence : 0.7963         
         Detection Rate : 0.7963         
   Detection Prevalence : 0.8889         
      Balanced Accuracy : 0.7727         
                                         
       'Positive' Class : Adulterated    
                                         
knitr::kable(cfmatrix_uvvis$byClass)
Metric                       Value
Sensitivity              1.0000000
Specificity              0.5454545
Pos Pred Value           0.8958333
Neg Pred Value           1.0000000
Precision                0.8958333
Recall                   1.0000000
F1                       0.9450549
Prevalence               0.7962963
Detection Rate           0.7962963
Detection Prevalence     0.8888889
Balanced Accuracy        0.7727273
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_uvvis_svm_table <- cfmatrix_uvvis$table
TP <- cfmatrix_uvvis_svm_table[1,1] 
TN <- cfmatrix_uvvis_svm_table[2,2] 
FP <- cfmatrix_uvvis_svm_table[2,1] 
FN <- cfmatrix_uvvis_svm_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.7"
Assess GC-MS SVM Model on Test Data
#Predict gc test set
test_gc_svm<-predict(fit_svm_gc,newdata=data_test_gc_cf)

#get the confusion matrix
cfmatrix_gc<-confusionMatrix(test_gc_svm,data_test_gc_cf$class_1)

#print the confusion matrix
print(cfmatrix_gc)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          62         2
  Pure_EVOO             2        10
                                          
               Accuracy : 0.9474          
                 95% CI : (0.8707, 0.9855)
    No Information Rate : 0.8421          
    P-Value [Acc > NIR] : 0.004605        
                                          
                  Kappa : 0.8021          
                                          
 Mcnemar's Test P-Value : 1.000000        
                                          
            Sensitivity : 0.9688          
            Specificity : 0.8333          
         Pos Pred Value : 0.9688          
         Neg Pred Value : 0.8333          
             Prevalence : 0.8421          
         Detection Rate : 0.8158          
   Detection Prevalence : 0.8421          
      Balanced Accuracy : 0.9010          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_gc$byClass)
Metric                       Value
Sensitivity              0.9687500
Specificity              0.8333333
Pos Pred Value           0.9687500
Neg Pred Value           0.8333333
Precision                0.9687500
Recall                   0.9687500
F1                       0.9687500
Prevalence               0.8421053
Detection Rate           0.8157895
Detection Prevalence     0.8421053
Balanced Accuracy        0.9010417
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_gc_svm_table <- cfmatrix_gc$table
TP <- cfmatrix_gc_svm_table[1,1] 
TN <- cfmatrix_gc_svm_table[2,2] 
FP <- cfmatrix_gc_svm_table[2,1] 
FN <- cfmatrix_gc_svm_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.8"
Plot Confusion Matrix Tables for SVM Binary Classification Algorithm
# Plotting the confusion matrix

#HSI SVM confusion Matrix
cf_hsi_svm<-ggplot(data = as.data.frame(cfmatrix_hsi$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "white", high = "#99ccff", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix HSI SVM')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#Raman SVM confusion Matrix
cf_raman_svm<-ggplot(data = as.data.frame(cfmatrix_raman$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkorange3", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix Raman SVM')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#FTIR SVM confusion Matrix
cf_ftir_svm<-ggplot(data = as.data.frame(cfmatrix_ftir$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkseagreen2", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix FTIR SVM')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#UV-Vis SVM confusion Matrix
cf_uvvis_svm<-ggplot(data = as.data.frame(cfmatrix_uvvis$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "turquoise", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix UV-Vis SVM')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))
library(gridExtra) #grid.arrange() is provided by gridExtra, not base grid
grid.arrange(cf_hsi_svm,cf_raman_svm,cf_ftir_svm,cf_uvvis_svm,nrow = 2)

#GC-MS SVM confusion Matrix
ggplot(data = as.data.frame(cfmatrix_gc$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "tan1", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix GC-MS SVM')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

Model 5. Artificial Neural Networks (ANN)

  • Artificial Neural Networks (ANN) are computational models inspired by the human brain, designed to recognize patterns and solve complex problems in machine learning tasks such as classification and regression. They consist of interconnected layers of nodes (neurons) that process and transform input data through weights and activation functions.
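As a toy illustration of the weighted-sum-plus-activation idea described above, the R snippet below computes the output of a single neuron. All values here are made up for illustration; in practice, `nnet` learns the weights and bias during training.

```r
# A single artificial neuron: weighted sum of inputs passed through an activation
sigmoid <- function(z) 1 / (1 + exp(-z))

x <- c(0.2, 0.7, 0.1)    # input features (e.g., three spectral intensities)
w <- c(0.5, -1.2, 0.8)   # connection weights (learned during training)
b <- 0.3                 # bias term

activation <- sigmoid(sum(w * x) + b)  # neuron output, squashed into (0, 1)
```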
#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

#start_time
start_time<-Sys.time()

#Set tune grid to find optimal values of size and decay via cross-validation
#size is the number of units (neurons) in the hidden layer of the neural network
#decay is the regularization parameter; it adds a penalty to the loss function

grid <- expand.grid(.size=seq(1,10, by =1),.decay=c(0,0.001,0.01,0.1))

#Train ANN models for all the techniques

#HSI ANN model

fit_nnet_hsi<-train(class_1~.,data = data_train_hsi_cf,
                   method = "nnet",tuneGrid = grid, trControl = control,
                   metric = metric,trace = FALSE, MaxNWts = 1000, maxit = 200)

#Raman ANN model
fit_nnet_raman<-train(class_1~.,data = data_train_raman_cf,
                   method = "nnet",tuneGrid = grid, trControl = control,
                   metric = metric,trace = FALSE, MaxNWts = 1000, maxit = 200)
Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,
: There were missing values in resampled performance measures.
#FTIR ANN model
fit_nnet_ftir<-train(class_1~.,data = data_train_ftir_cf,
                   method = "nnet",tuneGrid = grid, trControl = control,
                   metric = metric,trace = FALSE, MaxNWts = 1000, maxit = 200)
Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,
: There were missing values in resampled performance measures.
#UV-Vis ANN model
fit_nnet_uvvis<-train(class_1~.,data = data_train_uvvis_cf,
                   method = "nnet",tuneGrid = grid, trControl = control,
                   metric = metric,trace = FALSE, MaxNWts = 1000, maxit = 200)
#GC-MS ANN model
fit_nnet_gc<-train(class_1~.,data = data_train_gc_cf,
                 method = "nnet",tuneGrid = grid, trControl = control,
                   metric = metric,trace = FALSE, MaxNWts = 1000, maxit = 200)
Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo,
: There were missing values in resampled performance measures.
#End_time
end_time<-Sys.time()
model_training_time<-end_time-start_time
print(model_training_time)
Time difference of 4.274854 mins
stopCluster(cl)#stop the parallel run cluster
Plot the NNET CV Models
#HSI CV Plot
p1<-ggplot(fit_nnet_hsi)+geom_line(colour = "red")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='HSI NNET Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

 #Raman CV Plot
p2<-ggplot(fit_nnet_raman)+geom_line(colour = "blue")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='Raman NNET Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#FTIR CV Plot
p3<-ggplot(fit_nnet_ftir)+geom_line(colour = "black")+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='FTIR NNET Model Training', y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#UV-Vis CV Plot
p4<-ggplot(fit_nnet_uvvis)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='UV-Vis NNET Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

#Arrange the nnet training model plots

grid.arrange(p1,p2,p3,p4,nrow = 2)

#GC-MS NNET CV Plot
ggplot(fit_nnet_gc)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
  panel.grid = element_blank())+
  labs(title ='GC-MS NNET Model Training',y = "Accuracy")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

Display NNET cross-validation results
#HSI NNET CV results
print(paste('The optimal parameters for training the HSI-nnet model are',round(as.numeric(fit_nnet_hsi$bestTune$size),2),'neurons','and decay value of',fit_nnet_hsi$bestTune$decay))
[1] "The optimal parameters for training the HSI-nnet model are 2 neurons and decay value of 0.1"
#Output HSI NNET table
knitr::kable(fit_nnet_hsi$results)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
1 0.000 0.1855669 0.9487727 0.2101282 0.9106410 0.8119190 0.9309917 0.8856364 1.0000000 1.0000000 0.7997143 1.0000000 0.8856364 0.6942308 0.9428182 0.2321778 0.0741281 0.1966494 0.1227579 0.2392181 0.1018552 0.1573965 0.0000000 0.0000000 0.2413683 0.0000000 0.1573965 0.1266202 0.0786982
1 0.001 0.0227302 0.9913636 0.7583139 0.9931044 0.9825044 0.9953099 0.9913636 1.0000000 1.0000000 0.9786667 1.0000000 0.9913636 0.7766941 0.9956818 0.0919581 0.0360495 0.0447211 0.0288881 0.0727562 0.0197091 0.0360495 0.0000000 0.0000000 0.0876517 0.0000000 0.0360495 0.0362150 0.0180247
1 0.010 0.0171582 0.9950000 0.7618465 0.9961538 0.9910327 0.9972515 0.9950000 1.0000000 1.0000000 0.9895000 1.0000000 0.9950000 0.7797436 0.9975000 0.0721078 0.0297294 0.0403210 0.0228688 0.0528483 0.0164243 0.0297294 0.0000000 0.0000000 0.0612558 0.0000000 0.0297294 0.0364749 0.0148647
1 0.100 0.0367869 0.9966061 0.7636910 0.9962546 0.9908930 0.9973684 0.9952727 1.0000000 1.0000000 0.9891667 1.0000000 0.9952727 0.7798443 0.9976364 0.0687512 0.0280102 0.0400897 0.0240742 0.0559249 0.0173434 0.0305084 0.0000000 0.0000000 0.0645008 0.0000000 0.0305084 0.0347103 0.0152542
2 0.000 0.1576879 0.9884394 0.1577909 0.9846612 0.9603138 0.9893735 0.9835455 0.9883333 0.9973485 0.9632857 0.9973485 0.9835455 0.7705678 0.9859394 0.5560786 0.0414733 0.1411936 0.0463360 0.1113570 0.0339670 0.0567314 0.0680620 0.0151659 0.1146685 0.0151659 0.0567314 0.0498405 0.0431945
2 0.001 0.0007776 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0012274 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
2 0.010 0.0047796 1.0000000 0.7666212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0065152 0.0000000 0.0335438 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
2 0.100 0.0225697 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0135114 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
3 0.000 0.1668731 0.9915909 0.1453921 0.9845879 0.9585386 0.9897014 0.9842727 0.9866667 0.9964394 0.9603333 0.9964394 0.9842727 0.7711996 0.9854697 0.5478031 0.0299314 0.1191292 0.0363976 0.0961067 0.0246135 0.0434134 0.0656488 0.0175436 0.1070673 0.0175436 0.0434134 0.0417520 0.0379831
3 0.001 0.0006894 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0016104 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
3 0.010 0.0034037 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0043668 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
3 0.100 0.0184395 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0136818 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
4 0.000 0.1189303 0.9965000 0.1449901 0.9847253 0.9610657 0.9891642 0.9860909 0.9800000 0.9946212 0.9712500 0.9946212 0.9860909 0.7728755 0.9830455 0.4172925 0.0258736 0.1305624 0.0462453 0.1011356 0.0372035 0.0567573 0.0795611 0.0214084 0.0937689 0.0214084 0.0567573 0.0537666 0.0474062
4 0.001 0.0006745 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0014099 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
4 0.010 0.0031031 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0042698 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
4 0.100 0.0156716 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0122031 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
5 0.000 0.0512670 0.9995000 0.2117765 0.9937821 0.9828928 0.9958897 0.9930909 0.9966667 0.9990909 0.9808333 0.9990909 0.9930909 0.7781410 0.9948788 0.1910487 0.0050000 0.1489707 0.0212009 0.0587582 0.0140224 0.0253243 0.0333333 0.0090909 0.0709236 0.0090909 0.0253243 0.0320194 0.0206513
5 0.001 0.0007024 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0016675 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
5 0.010 0.0030894 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0043887 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
5 0.100 0.0149449 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0126818 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
6 0.000 0.0740915 0.9990455 0.1877118 0.9922344 0.9777635 0.9948872 0.9920909 0.9916667 0.9981818 0.9791667 0.9981818 0.9920909 0.7774267 0.9918788 0.3685848 0.0067232 0.1552132 0.0234371 0.0693054 0.0154329 0.0269697 0.0598117 0.0127914 0.0714361 0.0127914 0.0269697 0.0344220 0.0322942
6 0.001 0.0007628 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0014327 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
6 0.010 0.0031444 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0049387 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
6 0.100 0.0137432 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0120367 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
7 0.000 0.0632670 0.9995000 0.2095635 0.9961538 0.9897860 0.9974185 0.9960000 0.9966667 0.9990909 0.9900000 0.9990909 0.9960000 0.7805128 0.9963333 0.3168089 0.0050000 0.1474251 0.0168495 0.0449828 0.0113180 0.0196946 0.0333333 0.0090909 0.0492366 0.0090909 0.0196946 0.0318533 0.0191837
7 0.001 0.0009357 1.0000000 0.7637576 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0020368 0.0000000 0.0424721 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
7 0.010 0.0032434 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0049146 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
7 0.100 0.0135615 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0125787 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
8 0.000 0.0404420 1.0000000 0.2133788 0.9939011 0.9840135 0.9958897 0.9930909 0.9966667 0.9990909 0.9825000 0.9990909 0.9930909 0.7782601 0.9948788 0.1775534 0.0000000 0.1461740 0.0207930 0.0547063 0.0140224 0.0253243 0.0333333 0.0090909 0.0641081 0.0090909 0.0253243 0.0347584 0.0206513
8 0.001 0.0008920 1.0000000 0.7581212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0020529 0.0000000 0.0510751 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
8 0.010 0.0032225 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0050578 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
8 0.100 0.0136790 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0131323 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
9 0.000 0.0250730 1.0000000 0.2095758 0.9960256 0.9879291 0.9974185 0.9960000 0.9950000 0.9990909 0.9891667 0.9990909 0.9960000 0.7804487 0.9955000 0.1287829 0.0000000 0.1555227 0.0174254 0.0551854 0.0113180 0.0196946 0.0500000 0.0090909 0.0538305 0.0090909 0.0196946 0.0303076 0.0266809
9 0.001 0.0008912 1.0000000 0.7462576 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0019277 0.0000000 0.0623767 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
9 0.010 0.0033323 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0053609 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
9 0.100 0.0130209 1.0000000 0.7666212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0125511 0.0000000 0.0335438 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
10 0.000 0.0493044 0.9995000 0.2021090 0.9952564 0.9872388 0.9968421 0.9940000 1.0000000 1.0000000 0.9833333 1.0000000 0.9940000 0.7788462 0.9970000 0.2900285 0.0050000 0.1421554 0.0188849 0.0511786 0.0125623 0.0238683 0.0000000 0.0000000 0.0670025 0.0000000 0.0238683 0.0312134 0.0119342
10 0.001 0.0010461 1.0000000 0.7088030 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0024055 0.0000000 0.0798484 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
10 0.010 0.0033862 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0052259 0.0000000 0.0334150 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
10 0.100 0.0127581 1.0000000 0.7666212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835897 1.0000000 0.0124877 0.0000000 0.0335438 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0263793 0.0000000
#The optimal selected model
selected_model <- fit_nnet_hsi$results %>% 
  filter(size == fit_nnet_hsi$bestTune$size & decay == fit_nnet_hsi$bestTune$decay)

knitr::kable(selected_model)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
2 0.1 0.0225697 1 0.7671212 1 1 1 1 1 1 1 1 1 0.7835897 1 0.0135114 0 0.033415 0 0 0 0 0 0 0 0 0 0.0263793 0
#Raman NNET CV results
print(paste('The optimal parameters for training the Raman-nnet model are',round(as.numeric(fit_nnet_raman$bestTune$size),2),'neurons','and decay value of',fit_nnet_raman$bestTune$decay))
[1] "The optimal parameters for training the Raman-nnet model are 1 neurons and decay value of 0.1"
#Output Raman NNET table
knitr::kable(fit_nnet_raman$results)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
1 0.000 0.3851384 0.9554091 0.1257998 0.9257234 0.8322536 0.9446027 0.9106364 0.9800000 0.9951136 0.8302381 0.9951136 0.9106364 0.7132875 0.9453182 0.6560436 0.0730512 0.1487231 0.1093760 0.2207944 0.0876696 0.1384930 0.0895180 0.0216247 0.2236391 0.0216247 0.1384930 0.1094360 0.0800052
1 0.001 0.2107471 0.9933636 0.4876566 0.9650916 0.8964649 0.9774026 0.9713636 0.9433333 0.9861888 0.9251667 0.9861888 0.9713636 0.7610073 0.9573485 0.3115456 0.0200624 0.1514537 0.0484752 0.1519857 0.0311744 0.0510083 0.1627830 0.0394460 0.1322662 0.0394460 0.0510083 0.0453888 0.0803460
1 0.010 0.1363598 0.9898636 0.6712959 0.9665110 0.9039479 0.9780887 0.9683636 0.9600000 0.9901970 0.9151667 0.9901970 0.9683636 0.7586355 0.9641818 0.1866805 0.0369775 0.0804990 0.0533067 0.1568106 0.0347644 0.0540763 0.1423705 0.0356679 0.1466057 0.0356679 0.0540763 0.0470783 0.0763720
1 0.100 0.1087790 0.9910758 0.7358526 0.9681868 0.9139178 0.9788797 0.9643636 0.9833333 0.9952727 0.9008333 0.9952727 0.9643636 0.7556227 0.9738485 0.1062068 0.0224154 0.0561177 0.0468109 0.1319824 0.0306391 0.0495017 0.0870388 0.0244304 0.1409917 0.0244304 0.0495017 0.0458253 0.0536654
2 0.000 0.8713311 0.9445303 0.2215160 0.9254396 0.8020485 0.9497929 0.9279091 0.9216667 0.9780631 0.8191667 0.9780631 0.9279091 0.7266758 0.9247879 1.2468013 0.0771879 0.1583894 0.0754308 0.1963218 0.0519819 0.0850181 0.1597118 0.0441906 0.1990913 0.0441906 0.0850181 0.0665488 0.0901811
2 0.001 0.2335906 0.9801970 0.5219896 0.9542491 0.8639865 0.9704549 0.9644545 0.9150000 0.9789161 0.8930000 0.9789161 0.9644545 0.7556777 0.9397273 0.3501981 0.0467251 0.1275422 0.0625401 0.1961586 0.0399068 0.0551859 0.1857659 0.0461698 0.1696398 0.0461698 0.0551859 0.0492753 0.1012810
2 0.010 0.1193773 0.9925303 0.7279619 0.9673535 0.9024864 0.9789245 0.9753636 0.9383333 0.9846667 0.9311667 0.9846667 0.9753636 0.7642125 0.9568485 0.2001144 0.0222360 0.0777311 0.0520198 0.1612972 0.0334747 0.0491228 0.1618062 0.0401364 0.1368159 0.0401364 0.0491228 0.0451471 0.0844918
2 0.100 0.1050572 0.9920758 0.7350110 0.9611905 0.8915437 0.9744515 0.9614545 0.9633333 0.9904545 0.8985000 0.9904545 0.9614545 0.7532418 0.9623939 0.1007447 0.0199000 0.0593077 0.0475749 0.1404126 0.0310515 0.0540553 0.1331227 0.0341630 0.1415621 0.0341630 0.0540553 0.0473372 0.0666866
3 0.000 1.0845910 0.9485606 0.2054326 0.9357875 0.8242593 0.9567308 0.9443636 0.9066667 0.9752721 0.8624524 0.9752721 0.9443636 0.7401648 0.9255152 1.7708765 0.0876953 0.1461095 0.0762215 0.2028168 0.0537167 0.0852830 0.1885023 0.0494961 0.1868093 0.0494961 0.0852830 0.0729153 0.1009127
3 0.001 0.2098058 0.9845455 0.5821138 0.9572161 0.8671759 0.9726295 0.9713636 0.9000000 0.9761010 0.9145000 0.9761010 0.9713636 0.7612546 0.9356818 0.3557607 0.0383500 0.1354290 0.0609149 0.1942101 0.0389693 0.0513508 0.1981961 0.0476030 0.1543906 0.0476030 0.0513508 0.0492298 0.1054267
3 0.010 0.1351463 0.9881061 0.7368987 0.9595879 0.8794839 0.9739110 0.9683636 0.9266667 0.9818889 0.9118333 0.9818889 0.9683636 0.7588187 0.9475152 0.1785091 0.0263854 0.0618399 0.0549889 0.1718742 0.0351137 0.0505497 0.1777778 0.0438450 0.1441196 0.0438450 0.0505497 0.0475791 0.0924488
3 0.100 0.1146974 0.9899848 0.7463126 0.9542125 0.8718296 0.9698931 0.9563636 0.9466667 0.9859091 0.8753333 0.9859091 0.9563636 0.7494048 0.9515152 0.1172967 0.0221751 0.0542242 0.0573506 0.1700563 0.0371342 0.0549723 0.1457979 0.0383823 0.1615977 0.0383823 0.0549723 0.0500965 0.0823679
4 0.000 1.1106475 0.9490985 0.1949241 0.9384432 0.8210393 0.9594877 0.9483636 0.8983333 0.9752348 0.8611667 0.9752348 0.9483636 0.7431868 0.9233485 1.5718618 0.0961268 0.1438618 0.0681963 0.1981827 0.0461930 0.0715222 0.2022530 0.0497189 0.1777471 0.0497189 0.0715222 0.0617925 0.1049746
4 0.001 0.2091086 0.9827727 0.5898132 0.9525275 0.8558818 0.9695423 0.9655455 0.9066667 0.9765859 0.8978333 0.9765859 0.9655455 0.7562546 0.9361061 0.3436125 0.0557496 0.1284707 0.0606576 0.1995886 0.0386625 0.0566225 0.2000561 0.0481087 0.1786069 0.0481087 0.0566225 0.0459552 0.1039485
4 0.010 0.1258586 0.9879848 0.7376265 0.9606136 0.8850872 0.9743915 0.9674545 0.9350000 0.9837576 0.9110000 0.9837576 0.9674545 0.7581777 0.9512273 0.1757399 0.0251479 0.0599906 0.0519985 0.1608267 0.0333442 0.0504455 0.1639424 0.0408095 0.1423748 0.0408095 0.0504455 0.0486254 0.0851542
4 0.100 0.1197274 0.9914848 0.7494085 0.9550916 0.8731767 0.9704485 0.9585455 0.9416667 0.9850606 0.8848333 0.9850606 0.9585455 0.7511172 0.9501061 0.1272883 0.0209390 0.0551775 0.0584203 0.1722116 0.0382379 0.0581072 0.1523755 0.0386781 0.1611389 0.0386781 0.0581072 0.0522333 0.0845526
5 0.000 1.4844251 0.9341970 0.2204806 0.9267033 0.7905185 0.9516851 0.9377273 0.8900000 0.9720101 0.8360000 0.9720101 0.9377273 0.7343315 0.9138636 1.9376114 0.1003944 0.1472727 0.0736215 0.2113986 0.0497261 0.0802718 0.2148003 0.0545606 0.1955125 0.0545606 0.0802718 0.0619563 0.1095312
5 0.001 0.2700569 0.9823485 0.6131071 0.9430495 0.8172620 0.9639542 0.9643636 0.8633333 0.9672727 0.8944444 0.9672727 0.9643636 0.7555495 0.9138485 0.3364096 0.0413656 0.1101584 0.0615674 0.2172696 0.0385740 0.0537543 0.2431785 0.0573069 0.1577729 0.0573069 0.0537543 0.0472171 0.1209254
5 0.010 0.1454078 0.9903182 0.7388525 0.9565751 0.8705556 0.9719409 0.9654545 0.9233333 0.9812424 0.9036667 0.9812424 0.9654545 0.7565110 0.9443939 0.2048979 0.0235475 0.0597327 0.0565236 0.1796414 0.0360260 0.0530401 0.1856526 0.0455161 0.1513238 0.0455161 0.0530401 0.0486663 0.0959121
5 0.100 0.1363676 0.9893182 0.7488111 0.9443315 0.8468096 0.9629629 0.9494545 0.9283333 0.9805455 0.8630000 0.9805455 0.9494545 0.7440751 0.9388939 0.1482722 0.0248614 0.0513556 0.0684665 0.1938225 0.0457691 0.0698025 0.1695955 0.0459208 0.1800296 0.0459208 0.0698025 0.0610713 0.0942881
6 0.000 1.7210208 0.9289773 0.2324320 0.9255769 0.7804123 0.9515294 0.9445455 0.8566667 0.9630707 0.8449495 0.9630707 0.9445455 0.7401465 0.9006061 1.9075088 0.0983390 0.1185889 0.0691374 0.2125315 0.0457063 0.0695291 0.2209814 0.0551130 0.1845116 0.0551130 0.0695291 0.0600103 0.1130694
6 0.001 0.2496095 0.9816667 0.6064159 0.9450458 0.8417122 0.9640438 0.9563636 0.9050000 0.9751414 0.8815000 0.9751414 0.9563636 0.7495971 0.9306818 0.3149065 0.0366029 0.1064285 0.0643350 0.1907881 0.0421624 0.0618873 0.1913608 0.0495693 0.1656009 0.0495693 0.0618873 0.0573419 0.1006129
6 0.010 0.1543773 0.9873485 0.7378129 0.9496520 0.8545411 0.9670286 0.9594545 0.9133333 0.9776894 0.8875000 0.9776894 0.9594545 0.7518956 0.9363939 0.2084022 0.0296661 0.0602608 0.0662722 0.1937802 0.0440767 0.0630422 0.1842265 0.0471581 0.1678472 0.0471581 0.0630422 0.0564323 0.1004932
6 0.100 0.1334211 0.9899848 0.7503616 0.9417033 0.8367290 0.9612967 0.9494545 0.9133333 0.9777879 0.8676667 0.9777879 0.9494545 0.7439469 0.9313939 0.1431831 0.0237631 0.0496845 0.0684000 0.1930779 0.0462589 0.0740166 0.1780301 0.0453638 0.1823561 0.0453638 0.0740166 0.0626755 0.0955998
7 0.000 1.7596023 0.9348636 0.1909701 0.9225549 0.7836474 0.9483130 0.9347273 0.8800000 0.9685202 0.8314167 0.9685202 0.9347273 0.7323810 0.9073636 2.0137462 0.0882078 0.1166125 0.0726406 0.1930576 0.0523801 0.0836733 0.1954591 0.0507226 0.1882560 0.0507226 0.0836733 0.0690621 0.1007283
7 0.001 0.2430545 0.9842121 0.6339229 0.9432418 0.8262089 0.9637087 0.9614545 0.8750000 0.9690404 0.8876667 0.9690404 0.9614545 0.7533700 0.9182273 0.3262882 0.0365101 0.0998506 0.0611556 0.1950832 0.0388494 0.0540553 0.2136937 0.0527960 0.1622585 0.0527960 0.0540553 0.0491860 0.1092709
7 0.010 0.1512587 0.9870152 0.7286288 0.9517674 0.8525789 0.9688749 0.9633636 0.9066667 0.9773434 0.8944444 0.9773434 0.9633636 0.7549084 0.9350152 0.2287421 0.0292337 0.0685620 0.0603196 0.2009994 0.0387797 0.0558529 0.2042203 0.0481618 0.1591146 0.0481618 0.0558529 0.0508499 0.1055460
7 0.100 0.1531452 0.9852273 0.7440966 0.9370879 0.8207285 0.9586769 0.9473636 0.9016667 0.9742424 0.8510000 0.9742424 0.9473636 0.7423443 0.9245152 0.1549458 0.0283233 0.0526583 0.0668738 0.1985956 0.0441175 0.0671196 0.1925885 0.0503625 0.1809040 0.0503625 0.0671196 0.0580246 0.1018392
8 0.000 1.5104599 0.9357955 0.2118272 0.9213645 0.7707425 0.9485443 0.9415455 0.8483333 0.9609899 0.8359524 0.9609899 0.9415455 0.7375275 0.8949394 1.7780735 0.0980147 0.1292419 0.0785345 0.2266485 0.0532336 0.0800519 0.2223421 0.0569482 0.2049289 0.0569482 0.0800519 0.0641469 0.1182499
8 0.001 0.2440485 0.9829545 0.6401755 0.9379212 0.8162872 0.9597482 0.9544545 0.8766667 0.9688485 0.8718333 0.9688485 0.9544545 0.7479853 0.9155606 0.3359646 0.0351149 0.0941063 0.0631467 0.1935481 0.0409491 0.0619883 0.2059987 0.0518786 0.1725997 0.0518786 0.0619883 0.0558031 0.1054877
8 0.010 0.1633595 0.9825303 0.7239732 0.9448352 0.8419274 0.9634558 0.9524545 0.9150000 0.9792121 0.8747857 0.9792121 0.9524545 0.7463736 0.9337273 0.2419195 0.0409855 0.0651841 0.0645890 0.1850330 0.0447044 0.0714401 0.1857659 0.0456043 0.1694565 0.0456043 0.0714401 0.0615804 0.0958432
8 0.100 0.1451041 0.9875303 0.7488223 0.9448443 0.8399538 0.9640373 0.9543636 0.9100000 0.9771515 0.8706667 0.9771515 0.9543636 0.7478571 0.9321818 0.1576930 0.0296148 0.0516109 0.0597739 0.1825226 0.0387238 0.0590146 0.1856526 0.0471885 0.1665171 0.0471885 0.0590146 0.0530528 0.0961349
9 0.000 1.8490603 0.9170606 0.1903994 0.9085714 0.7353083 0.9395048 0.9307273 0.8250000 0.9555253 0.8122500 0.9555253 0.9307273 0.7294872 0.8778636 1.9192387 0.1098311 0.1205124 0.0788474 0.2284506 0.0558355 0.0875769 0.2373334 0.0584720 0.2148214 0.0584720 0.0875769 0.0743576 0.1212020
9 0.001 0.2312363 0.9787273 0.6471411 0.9521520 0.8572220 0.9690491 0.9673636 0.8983333 0.9737879 0.9078333 0.9737879 0.9673636 0.7581777 0.9328485 0.3549784 0.0521673 0.0951832 0.0626948 0.1922338 0.0408833 0.0565457 0.1994591 0.0510192 0.1523194 0.0510192 0.0565457 0.0533159 0.1039918
9 0.010 0.1541164 0.9897727 0.7285502 0.9489469 0.8519932 0.9665967 0.9593636 0.9116667 0.9774545 0.8918333 0.9774545 0.9593636 0.7518956 0.9355152 0.1996925 0.0242390 0.0588569 0.0572065 0.1719693 0.0376608 0.0601284 0.1826587 0.0461950 0.1546056 0.0461950 0.0601284 0.0553630 0.0922135
9 0.100 0.1555684 0.9840152 0.7415787 0.9355495 0.8133072 0.9579359 0.9504545 0.8833333 0.9696667 0.8616667 0.9696667 0.9504545 0.7447161 0.9168939 0.1687776 0.0375040 0.0579893 0.0661284 0.1964759 0.0435420 0.0668464 0.2003084 0.0517325 0.1791691 0.0517325 0.0668464 0.0573472 0.1038852
10 0.000 1.8549222 0.9283561 0.1802235 0.9200733 0.7650013 0.9477453 0.9397273 0.8483333 0.9608081 0.8260462 0.9608081 0.9397273 0.7363004 0.8940303 2.0373815 0.1046873 0.1215910 0.0725661 0.2240466 0.0488737 0.0742568 0.2236005 0.0548492 0.2028836 0.0548492 0.0742568 0.0622694 0.1158612
10 0.001 0.2319487 0.9836667 0.6415687 0.9393773 0.8181752 0.9607912 0.9535455 0.8883333 0.9719697 0.8661616 0.9719697 0.9535455 0.7470696 0.9209394 0.3134058 0.0410376 0.0946115 0.0612394 0.2036055 0.0392991 0.0603285 0.2132731 0.0516749 0.1702082 0.0516749 0.0603285 0.0518043 0.1075844
10 0.010 0.1549729 0.9884242 0.7199236 0.9436172 0.8361348 0.9627219 0.9583636 0.8883333 0.9720606 0.8961190 0.9720606 0.9583636 0.7512546 0.9233485 0.2098457 0.0275119 0.0717231 0.0672484 0.1915813 0.0467180 0.0738343 0.2010843 0.0499094 0.1638796 0.0499094 0.0738343 0.0661279 0.1025482
10 0.100 0.1519419 0.9838333 0.7427996 0.9426557 0.8333313 0.9626487 0.9544545 0.9000000 0.9743030 0.8693333 0.9743030 0.9544545 0.7479212 0.9272273 0.1663810 0.0366565 0.0584254 0.0656141 0.1999925 0.0424785 0.0619883 0.1953442 0.0500373 0.1761561 0.0500373 0.0619883 0.0548971 0.1040797
#The optimal selected model
selected_model <- fit_nnet_raman$results %>% 
  filter(size == fit_nnet_raman$bestTune$size & decay == fit_nnet_raman$bestTune$decay)

knitr::kable(selected_model)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
1 0.1 0.108779 0.9910758 0.7358526 0.9681868 0.9139178 0.9788797 0.9643636 0.9833333 0.9952727 0.9008333 0.9952727 0.9643636 0.7556227 0.9738485 0.1062068 0.0224154 0.0561177 0.0468109 0.1319824 0.0306391 0.0495017 0.0870388 0.0244304 0.1409917 0.0244304 0.0495017 0.0458253 0.0536654
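The same `filter(size == … & decay == …)` pattern is repeated for every technique below. As a minimal sketch (assuming the caret `train` objects such as `fit_nnet_raman` are in scope), a small helper using `dplyr::semi_join()` matches on whatever tuning-parameter columns `bestTune` contains, so it is not hard-coded to `size`/`decay`:

```r
# Sketch: generalize the repeated bestTune filtering.
# semi_join() keeps the rows of fit$results that match fit$bestTune
# on all tuning-parameter columns.
library(dplyr)

best_row <- function(fit) {
  fit$results %>%
    semi_join(fit$bestTune, by = names(fit$bestTune))
}

knitr::kable(best_row(fit_nnet_raman))
```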
#FTIR NNET CV results
print(paste('The optimal parameters for training the FTIR-nnet model are',round(as.numeric(fit_nnet_ftir$bestTune$size),2),'neurons','and decay value of',fit_nnet_ftir$bestTune$decay))
[1] "The optimal parameters for training the FTIR-nnet model are 4 neurons and decay value of 0.001"
#Output FTIR NNET table
knitr::kable(fit_nnet_ftir$results)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
1 0.000 0.3577433 0.8608788 0.1196058 0.8749084 0.6567946 0.9152654 0.8830909 0.8433333 0.9601746 0.6931698 0.9601746 0.8830909 0.6919505 0.8632121 0.2489098 0.1525429 0.0889690 0.0947925 0.2844093 0.0655573 0.1012158 0.2889860 0.0713963 0.2427277 0.0713963 0.1012158 0.0829754 0.1480983
1 0.001 0.1825109 0.9485758 0.3915552 0.9428571 0.8596306 0.9600671 0.9336364 0.9766667 0.9930152 0.8486667 0.9930152 0.9336364 0.7316484 0.9551515 0.2380362 0.0795843 0.1798705 0.0762085 0.1800557 0.0545050 0.0906286 0.0854775 0.0257477 0.1903786 0.0257477 0.0906286 0.0767086 0.0665539
1 0.010 0.0953800 0.9717803 0.5079353 0.9672344 0.9176795 0.9773940 0.9596364 0.9950000 0.9990909 0.9048333 0.9990909 0.9596364 0.7515842 0.9773182 0.1556429 0.0727079 0.1552244 0.0587774 0.1414190 0.0415194 0.0732114 0.0500000 0.0090909 0.1612642 0.0090909 0.0732114 0.0599438 0.0431631
1 0.100 0.1840391 0.9327348 0.5687311 0.9372711 0.8462062 0.9560403 0.9249091 0.9816667 0.9951818 0.8308333 0.9951818 0.9249091 0.7246978 0.9532879 0.1756076 0.1007638 0.1333927 0.0749156 0.1735074 0.0542539 0.0934236 0.0817012 0.0211315 0.1884334 0.0211315 0.0934236 0.0778941 0.0605023
2 0.000 0.3103969 0.9490606 0.1808753 0.9160165 0.7912089 0.9412204 0.9019091 0.9650000 0.9912929 0.7757857 0.9912929 0.9019091 0.7065201 0.9334545 0.6780673 0.0955459 0.1350106 0.0859843 0.2177644 0.0617631 0.1016378 0.1427150 0.0330206 0.2154171 0.0330206 0.1016378 0.0821406 0.0901189
2 0.001 0.0248864 0.9945000 0.5902468 0.9884615 0.9725091 0.9917805 0.9861818 0.9966667 0.9990000 0.9701667 0.9990000 0.9861818 0.7727473 0.9914242 0.0886497 0.0376837 0.1450096 0.0428708 0.0999040 0.0311697 0.0526052 0.0333333 0.0100000 0.1063446 0.0100000 0.0526052 0.0497662 0.0333870
2 0.010 0.0208146 0.9990000 0.6093735 0.9900000 0.9768835 0.9927967 0.9870000 1.0000000 1.0000000 0.9730000 1.0000000 0.9870000 0.7735165 0.9935000 0.0566436 0.0100000 0.1360313 0.0373259 0.0840467 0.0273530 0.0485237 0.0000000 0.0000000 0.0957216 0.0000000 0.0485237 0.0491004 0.0242618
2 0.100 0.0603370 0.9939091 0.6850827 0.9828755 0.9573057 0.9881788 0.9780909 1.0000000 1.0000000 0.9473333 1.0000000 0.9780909 0.7663919 0.9890455 0.0875024 0.0242338 0.1138303 0.0408709 0.1002424 0.0285142 0.0522488 0.0000000 0.0000000 0.1214327 0.0000000 0.0522488 0.0493783 0.0261244
3 0.000 0.3089128 0.9772197 0.2090314 0.9442857 0.8540316 0.9618686 0.9376364 0.9683333 0.9926061 0.8511785 0.9926061 0.9376364 0.7341484 0.9529848 0.8339518 0.0629124 0.1590597 0.0719667 0.1933545 0.0504011 0.0867268 0.1312442 0.0279075 0.1899883 0.0279075 0.0867268 0.0679550 0.0760162
3 0.001 0.0029790 1.0000000 0.6838636 0.9992308 0.9980597 0.9994737 0.9990000 1.0000000 1.0000000 0.9975000 1.0000000 0.9990000 0.7827473 0.9995000 0.0106796 0.0000000 0.1325429 0.0076923 0.0194030 0.0052632 0.0100000 0.0000000 0.0000000 0.0250000 0.0000000 0.0100000 0.0284674 0.0050000
3 0.010 0.0076740 1.0000000 0.6907879 0.9992308 0.9980597 0.9994737 0.9990000 1.0000000 1.0000000 0.9975000 1.0000000 0.9990000 0.7827473 0.9995000 0.0129347 0.0000000 0.1178469 0.0076923 0.0194030 0.0052632 0.0100000 0.0000000 0.0000000 0.0250000 0.0000000 0.0100000 0.0284674 0.0050000
3 0.100 0.0317612 1.0000000 0.7387879 0.9921154 0.9789179 0.9947368 0.9900000 1.0000000 1.0000000 0.9725000 1.0000000 0.9900000 0.7756319 0.9950000 0.0320757 0.0000000 0.0732939 0.0237913 0.0640855 0.0158690 0.0301511 0.0000000 0.0000000 0.0837992 0.0000000 0.0301511 0.0346244 0.0150756
4 0.000 0.2887457 0.9844318 0.1616859 0.9587821 0.8958262 0.9714809 0.9523636 0.9800000 0.9950000 0.8858333 0.9950000 0.9523636 0.7462729 0.9661818 0.8983619 0.0488649 0.1279258 0.0700922 0.1780759 0.0489247 0.0794749 0.1012825 0.0261116 0.1795739 0.0261116 0.0794749 0.0685319 0.0707759
4 0.001 0.0029259 1.0000000 0.7256212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835165 1.0000000 0.0086218 0.0000000 0.0916436 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0270004 0.0000000
4 0.010 0.0098988 0.9996667 0.7374207 0.9984615 0.9956667 0.9990000 0.9990000 0.9966667 0.9990000 0.9966667 0.9990000 0.9990000 0.7827473 0.9978333 0.0428609 0.0033333 0.0844031 0.0153846 0.0433333 0.0100000 0.0100000 0.0333333 0.0100000 0.0333333 0.0100000 0.0100000 0.0284674 0.0216667
4 0.100 0.0281558 1.0000000 0.7562879 0.9922436 0.9800373 0.9947368 0.9900000 1.0000000 1.0000000 0.9741667 1.0000000 0.9900000 0.7757601 0.9950000 0.0309729 0.0000000 0.0555248 0.0233944 0.0604259 0.0158690 0.0301511 0.0000000 0.0000000 0.0782946 0.0000000 0.0301511 0.0373129 0.0150756
5 0.000 0.1446579 0.9909091 0.1972917 0.9637546 0.9097970 0.9746185 0.9583636 0.9816667 0.9951616 0.9073333 0.9951616 0.9583636 0.7511813 0.9700152 0.4268374 0.0287238 0.1618263 0.0638703 0.1529244 0.0458781 0.0784500 0.0817012 0.0212722 0.1632017 0.0212722 0.0784500 0.0694983 0.0576793
5 0.001 0.0023498 1.0000000 0.7411212 0.9992308 0.9980597 0.9994737 0.9990000 1.0000000 1.0000000 0.9975000 1.0000000 0.9990000 0.7827473 0.9995000 0.0068063 0.0000000 0.0828036 0.0076923 0.0194030 0.0052632 0.0100000 0.0000000 0.0000000 0.0250000 0.0000000 0.0100000 0.0284674 0.0050000
5 0.010 0.0084332 0.9996667 0.7507540 0.9992308 0.9980597 0.9994737 0.9990000 1.0000000 1.0000000 0.9975000 1.0000000 0.9990000 0.7827473 0.9995000 0.0190333 0.0033333 0.0601686 0.0076923 0.0194030 0.0052632 0.0100000 0.0000000 0.0000000 0.0250000 0.0000000 0.0100000 0.0284674 0.0050000
5 0.100 0.0284816 1.0000000 0.7529545 0.9851282 0.9607501 0.9899916 0.9810909 1.0000000 1.0000000 0.9493333 1.0000000 0.9810909 0.7686447 0.9905455 0.0313193 0.0000000 0.0587437 0.0327606 0.0859485 0.0222288 0.0417435 0.0000000 0.0000000 0.1105014 0.0000000 0.0417435 0.0409912 0.0208717
6 0.000 0.2245077 0.9925758 0.1679149 0.9666484 0.9149077 0.9770511 0.9622727 0.9833333 0.9954040 0.9115000 0.9954040 0.9622727 0.7540110 0.9728030 0.6008636 0.0300999 0.1562122 0.0585659 0.1464175 0.0412477 0.0702799 0.0870388 0.0234715 0.1549718 0.0234715 0.0702799 0.0615032 0.0561060
6 0.001 0.0064753 0.9996667 0.7466177 0.9976282 0.9936194 0.9984211 0.9970000 1.0000000 1.0000000 0.9916667 1.0000000 0.9970000 0.7811447 0.9985000 0.0329673 0.0033333 0.0673901 0.0135647 0.0367525 0.0090235 0.0171447 0.0000000 0.0000000 0.0481125 0.0000000 0.0171447 0.0295560 0.0085723
6 0.010 0.0070695 1.0000000 0.7546212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835165 1.0000000 0.0111659 0.0000000 0.0571815 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0270004 0.0000000
6 0.100 0.0282423 1.0000000 0.7546212 0.9860256 0.9638098 0.9905180 0.9820909 1.0000000 1.0000000 0.9535000 1.0000000 0.9820909 0.7695421 0.9910455 0.0332276 0.0000000 0.0571815 0.0319200 0.0820850 0.0218289 0.0409719 0.0000000 0.0000000 0.1049102 0.0000000 0.0409719 0.0426126 0.0204859
7 0.000 0.1183667 0.9951667 0.1931729 0.9688462 0.9186772 0.9787937 0.9671818 0.9766667 0.9936061 0.9211667 0.9936061 0.9671818 0.7577473 0.9719242 0.3471507 0.0222594 0.1515095 0.0564598 0.1450303 0.0393526 0.0663138 0.0977296 0.0263011 0.1492566 0.0263011 0.0663138 0.0575018 0.0586268
7 0.001 0.0037797 1.0000000 0.7549545 0.9984615 0.9961194 0.9989474 0.9980000 1.0000000 1.0000000 0.9950000 1.0000000 0.9980000 0.7819780 0.9990000 0.0120588 0.0000000 0.0555096 0.0108235 0.0273010 0.0074055 0.0140705 0.0000000 0.0000000 0.0351763 0.0000000 0.0140705 0.0298424 0.0070353
7 0.010 0.0090229 1.0000000 0.7562879 0.9984615 0.9961194 0.9989474 0.9980000 1.0000000 1.0000000 0.9950000 1.0000000 0.9980000 0.7819780 0.9990000 0.0155428 0.0000000 0.0555248 0.0108235 0.0273010 0.0074055 0.0140705 0.0000000 0.0000000 0.0351763 0.0000000 0.0140705 0.0298424 0.0070353
7 0.100 0.0281396 0.9996667 0.7557540 0.9851923 0.9621894 0.9898830 0.9810000 1.0000000 1.0000000 0.9520000 1.0000000 0.9810000 0.7687088 0.9905000 0.0345865 0.0033333 0.0555188 0.0344143 0.0866435 0.0237851 0.0442559 0.0000000 0.0000000 0.1088477 0.0000000 0.0442559 0.0449106 0.0221280
8 0.000 0.1301818 0.9936667 0.1930370 0.9759524 0.9392268 0.9833290 0.9720000 0.9900000 0.9973485 0.9346667 0.9973485 0.9720000 0.7617216 0.9810000 0.4780191 0.0265676 0.1398114 0.0487631 0.1165591 0.0351441 0.0620850 0.0571489 0.0151659 0.1328628 0.0151659 0.0620850 0.0568985 0.0404811
8 0.001 0.0025912 1.0000000 0.7282879 0.9992308 0.9980597 0.9994737 0.9990000 1.0000000 1.0000000 0.9975000 1.0000000 0.9990000 0.7827473 0.9995000 0.0090040 0.0000000 0.0927705 0.0076923 0.0194030 0.0052632 0.0100000 0.0000000 0.0000000 0.0250000 0.0000000 0.0100000 0.0284674 0.0050000
8 0.010 0.0083290 0.9996667 0.7557540 0.9992308 0.9980597 0.9994737 0.9990000 1.0000000 1.0000000 0.9975000 1.0000000 0.9990000 0.7827473 0.9995000 0.0195795 0.0033333 0.0555188 0.0076923 0.0194030 0.0052632 0.0100000 0.0000000 0.0000000 0.0250000 0.0000000 0.0100000 0.0284674 0.0050000
8 0.100 0.0301648 0.9996667 0.7557540 0.9835897 0.9581162 0.9887719 0.9790000 1.0000000 1.0000000 0.9471667 1.0000000 0.9790000 0.7671062 0.9895000 0.0393440 0.0033333 0.0555188 0.0371922 0.0931514 0.0258162 0.0477684 0.0000000 0.0000000 0.1160289 0.0000000 0.0477684 0.0464135 0.0238842
9 0.000 0.1281739 0.9930000 0.1571190 0.9860256 0.9634835 0.9904934 0.9850000 0.9900000 0.9971818 0.9625000 0.9971818 0.9850000 0.7718498 0.9875000 0.5405414 0.0316210 0.1352771 0.0371147 0.0937659 0.0260362 0.0435194 0.0571489 0.0161229 0.1007652 0.0161229 0.0435194 0.0445905 0.0371830
9 0.001 0.0032739 1.0000000 0.7276212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835165 1.0000000 0.0074324 0.0000000 0.0959144 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0270004 0.0000000
9 0.010 0.0087662 1.0000000 0.7562879 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835165 1.0000000 0.0129729 0.0000000 0.0555248 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0270004 0.0000000
9 0.100 0.0313414 0.9996667 0.7557540 0.9828205 0.9544764 0.9884653 0.9790909 0.9966667 0.9990000 0.9435000 0.9990000 0.9790909 0.7671062 0.9878788 0.0449958 0.0033333 0.0555188 0.0360193 0.0955351 0.0242878 0.0431752 0.0333333 0.0100000 0.1154675 0.0100000 0.0431752 0.0423745 0.0296134
10 0.000 0.2781637 0.9889167 0.1588627 0.9799267 0.9442609 0.9867567 0.9802727 0.9800000 0.9946061 0.9460000 0.9946061 0.9802727 0.7680586 0.9801364 0.8676534 0.0454133 0.1401852 0.0446798 0.1299737 0.0291333 0.0440851 0.1040159 0.0277417 0.1210949 0.0277417 0.0440851 0.0436127 0.0591311
10 0.001 0.0075479 0.9996667 0.7027843 0.9984615 0.9956667 0.9990000 0.9990000 0.9966667 0.9990000 0.9966667 0.9990000 0.9990000 0.7827473 0.9978333 0.0464424 0.0033333 0.1001518 0.0153846 0.0433333 0.0100000 0.0100000 0.0333333 0.0100000 0.0333333 0.0100000 0.0100000 0.0284674 0.0216667
10 0.010 0.0096864 0.9996667 0.7540874 0.9984615 0.9961194 0.9989474 0.9980000 1.0000000 1.0000000 0.9950000 1.0000000 0.9980000 0.7819780 0.9990000 0.0202328 0.0033333 0.0571599 0.0108235 0.0273010 0.0074055 0.0140705 0.0000000 0.0000000 0.0351763 0.0000000 0.0140705 0.0298424 0.0070353
10 0.100 0.0320789 0.9996667 0.7557540 0.9812821 0.9521498 0.9871846 0.9760909 1.0000000 1.0000000 0.9398333 1.0000000 0.9760909 0.7647985 0.9880455 0.0452144 0.0033333 0.0555188 0.0400028 0.0999553 0.0278485 0.0513554 0.0000000 0.0000000 0.1239523 0.0000000 0.0513554 0.0481481 0.0256777
#The optimal selected model
selected_model <- fit_nnet_ftir$results %>% 
  filter(size == fit_nnet_ftir$bestTune$size & decay == fit_nnet_ftir$bestTune$decay)

knitr::kable(selected_model)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
4 0.001 0.0029259 1 0.7256212 1 1 1 1 1 1 1 1 1 0.7835165 1 0.0086218 0 0.0916436 0 0 0 0 0 0 0 0 0 0.0270004 0
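The selected FTIR model reports perfect cross-validated metrics (Accuracy, Kappa, F1 all 1). One hedged sanity check, assuming `trainControl()` was called with `savePredictions = "final"` (otherwise `fit$pred` is `NULL`), is to tabulate the pooled resample predictions directly:

```r
# Sketch: cross-check the perfect CV summary against the pooled
# out-of-fold predictions saved by caret. Only valid if savePredictions
# was enabled in trainControl() -- an assumption here.
library(caret)

if (!is.null(fit_nnet_ftir$pred)) {
  confusionMatrix(fit_nnet_ftir$pred$pred, fit_nnet_ftir$pred$obs)
}
```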
#UV-VIS NNET CV results
print(paste('The optimal parameters for training the UVVIS-nnet model are',round(as.numeric(fit_nnet_uvvis$bestTune$size),2),'neurons','and decay value of',fit_nnet_uvvis$bestTune$decay))
[1] "The optimal parameters for training the UVVIS-nnet model are 1 neurons and decay value of 0.1"
#Output UV-Vis NNET table
knitr::kable(fit_nnet_uvvis$results)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
1 0.000 0.2446176 0.9950455 0.0290256 0.9845879 0.9598818 0.9895572 0.9812727 0.9966667 0.9990909 0.9520000 0.9990909 0.9812727 0.7689103 0.9889697 0.7579955 0.0165485 0.0744511 0.0346549 0.0887522 0.0238042 0.0437766 0.0333333 0.0090909 0.1088477 0.0090909 0.0437766 0.0436527 0.0269323
1 0.001 0.0008001 1.0000000 0.5046212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0010131 0.0000000 0.0881289 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
1 0.010 0.0043592 1.0000000 0.5046212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0028787 0.0000000 0.0881289 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
1 0.100 0.0246370 1.0000000 0.5046212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0081683 0.0000000 0.0881289 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
2 0.000 0.1176861 0.9985000 0.0692066 0.9937179 0.9828457 0.9958396 0.9930000 0.9966667 0.9990909 0.9808333 0.9990909 0.9930000 0.7780403 0.9948333 0.4932690 0.0085723 0.1198571 0.0214253 0.0589448 0.0141875 0.0256432 0.0333333 0.0090909 0.0709236 0.0090909 0.0256432 0.0326276 0.0207458
2 0.001 0.0008042 1.0000000 0.5046212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0014711 0.0000000 0.0881289 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
2 0.010 0.0033238 1.0000000 0.5079545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0037612 0.0000000 0.0952427 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
2 0.100 0.0156707 1.0000000 0.5279545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0085836 0.0000000 0.1142783 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
3 0.000 0.1953899 0.9949242 0.0908011 0.9924176 0.9800977 0.9948922 0.9921818 0.9933333 0.9980909 0.9801667 0.9980909 0.9921818 0.7775092 0.9927576 0.8473852 0.0291806 0.1390010 0.0273086 0.0719965 0.0183977 0.0296351 0.0469018 0.0134465 0.0741467 0.0134465 0.0296351 0.0369812 0.0301915
3 0.001 0.0008859 1.0000000 0.5079545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0017528 0.0000000 0.0952427 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
3 0.010 0.0030457 1.0000000 0.5246212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0042905 0.0000000 0.1115837 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
3 0.100 0.0127777 1.0000000 0.5829545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0095750 0.0000000 0.1307121 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
4 0.000 0.1720391 0.9971364 0.1031505 0.9924725 0.9817512 0.9947368 0.9902727 1.0000000 1.0000000 0.9775000 1.0000000 0.9902727 0.7760256 0.9951364 0.8662712 0.0159782 0.1588068 0.0287223 0.0672707 0.0204868 0.0369163 0.0000000 0.0000000 0.0802065 0.0000000 0.0369163 0.0410349 0.0184581
4 0.001 0.0009846 1.0000000 0.5212879 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0022368 0.0000000 0.1087192 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
4 0.010 0.0029304 1.0000000 0.5529545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0044942 0.0000000 0.1240191 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
4 0.100 0.0110832 1.0000000 0.6204545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0096632 0.0000000 0.1284372 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
5 0.000 0.0667163 0.9995000 0.1021241 0.9922985 0.9785481 0.9949373 0.9930909 0.9900000 0.9972727 0.9816667 0.9972727 0.9930909 0.7781593 0.9915455 0.3046448 0.0050000 0.1579633 0.0232363 0.0652117 0.0152844 0.0253243 0.0571489 0.0155861 0.0676070 0.0155861 0.0253243 0.0335849 0.0306910
5 0.001 0.0011033 1.0000000 0.5329545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0023182 0.0000000 0.1192739 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
5 0.010 0.0030314 1.0000000 0.5829545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0050706 0.0000000 0.1285476 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
5 0.100 0.0098450 1.0000000 0.6737879 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0097753 0.0000000 0.1194072 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
6 0.000 0.0506035 0.9995000 0.1181250 0.9960897 0.9892263 0.9974185 0.9960000 0.9966667 0.9990909 0.9891667 0.9990909 0.9960000 0.7804121 0.9963333 0.3117845 0.0050000 0.1686977 0.0171400 0.0475670 0.0113180 0.0196946 0.0333333 0.0090909 0.0538305 0.0090909 0.0196946 0.0305761 0.0191837
6 0.001 0.0011693 1.0000000 0.5437879 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0024872 0.0000000 0.1219520 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
6 0.010 0.0030327 1.0000000 0.6071212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0052309 0.0000000 0.1264215 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
6 0.100 0.0095402 1.0000000 0.7304545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0103145 0.0000000 0.0874414 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
7 0.000 0.0907924 0.9968333 0.1014627 0.9906960 0.9733704 0.9938847 0.9910909 0.9883333 0.9972727 0.9766667 0.9972727 0.9910909 0.7766209 0.9897121 0.4024242 0.0199178 0.1385837 0.0253439 0.0746618 0.0166640 0.0284848 0.0680620 0.0155861 0.0749860 0.0155861 0.0284848 0.0356770 0.0361727
7 0.001 0.0011853 1.0000000 0.5779545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0024369 0.0000000 0.1281561 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
7 0.010 0.0029839 1.0000000 0.6387879 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0050578 0.0000000 0.1299564 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
7 0.100 0.0089574 1.0000000 0.7571212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0102499 0.0000000 0.0540837 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
8 0.000 0.0764173 0.9990000 0.1109907 0.9960897 0.9897388 0.9973684 0.9950000 1.0000000 1.0000000 0.9866667 1.0000000 0.9950000 0.7796429 0.9975000 0.4015509 0.0070353 0.1543953 0.0171400 0.0452335 0.0115286 0.0219043 0.0000000 0.0000000 0.0588898 0.0000000 0.0219043 0.0318033 0.0109521
8 0.001 0.0013769 1.0000000 0.5962879 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0028807 0.0000000 0.1292560 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
8 0.010 0.0030121 1.0000000 0.6679545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0052053 0.0000000 0.1273355 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
8 0.100 0.0085980 1.0000000 0.7604545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0104346 0.0000000 0.0443622 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
9 0.000 0.1448973 0.9967576 0.1135156 0.9922894 0.9801453 0.9947811 0.9911818 0.9966667 0.9990909 0.9785000 0.9990909 0.9911818 0.7766117 0.9939242 0.6764419 0.0209193 0.1352095 0.0280230 0.0717214 0.0190003 0.0341242 0.0333333 0.0090909 0.0826502 0.0090909 0.0341242 0.0372396 0.0235382
9 0.001 0.0014523 1.0000000 0.6112879 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0030195 0.0000000 0.1314089 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
9 0.010 0.0032324 1.0000000 0.6946212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0059079 0.0000000 0.1171767 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
9 0.100 0.0084173 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0103448 0.0000000 0.0333003 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
10 0.000 0.1132922 0.9980909 0.1133781 0.9938919 0.9838713 0.9958922 0.9922727 1.0000000 1.0000000 0.9793333 1.0000000 0.9922727 0.7774451 0.9961364 0.6158030 0.0114150 0.1440503 0.0231830 0.0606229 0.0157090 0.0293621 0.0000000 0.0000000 0.0771664 0.0000000 0.0293621 0.0339352 0.0146811
10 0.001 0.0013440 1.0000000 0.6337879 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0028042 0.0000000 0.1273963 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
10 0.010 0.0031954 1.0000000 0.7304545 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0056483 0.0000000 0.0905089 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
10 0.100 0.0083065 1.0000000 0.7671212 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.7835531 1.0000000 0.0102767 0.0000000 0.0333003 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0266917 0.0000000
#The optimal selected model
selected_model <- fit_nnet_uvvis$results %>% 
  filter(size == fit_nnet_uvvis$bestTune$size & decay == fit_nnet_uvvis$bestTune$decay)

knitr::kable(selected_model)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
1 0.1 0.024637 1 0.5046212 1 1 1 1 1 1 1 1 1 0.7835531 1 0.0081683 0 0.0881289 0 0 0 0 0 0 0 0 0 0.0266917 0
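To place the technique-level results side by side, a sketch (assuming all four caret nnet fits are in scope) binds the optimal row from each fit into one comparison table:

```r
# Sketch: collect the optimal nnet model from each technique into a
# single comparison table; .id adds a Technique column from the list names.
library(dplyr)

fits <- list(Raman = fit_nnet_raman, FTIR = fit_nnet_ftir,
             `UV-Vis` = fit_nnet_uvvis, `GC-MS` = fit_nnet_gc)

comparison <- bind_rows(
  lapply(fits, function(fit) {
    semi_join(fit$results, fit$bestTune, by = names(fit$bestTune))
  }),
  .id = "Technique"
)

knitr::kable(select(comparison, Technique, size, decay, Accuracy, Kappa, AUC))
```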
#GC-MS NNET CV results
print(paste('The optimal parameters for training the gc-nnet model are',round(as.numeric(fit_nnet_gc$bestTune$size),2),'neurons','and decay value of',fit_nnet_gc$bestTune$decay))
[1] "The optimal parameters for training the gc-nnet model are 3 neurons and decay value of 0.1"
#Output GC-MS NNET table
knitr::kable(fit_nnet_gc$results)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
1 0.000 0.5323597 0.9498056 0.1241799 0.9374269 0.7974261 0.9610449 0.9422500 0.9133333 0.9833063 0.7998333 0.9833063 0.9422500 0.7868421 0.9277917 0.8330855 0.0772727 0.1163596 0.0530353 0.1657801 0.0337875 0.0591276 0.1615025 0.0307903 0.1833945 0.0307903 0.0591276 0.0491801 0.0823102
1 0.001 0.2274131 0.9736528 0.5492737 0.9351170 0.7933307 0.9592595 0.9407917 0.9066667 0.9818365 0.7981667 0.9818365 0.9407917 0.7856433 0.9237292 0.2551404 0.0391373 0.1195339 0.0579748 0.1627618 0.0389868 0.0667910 0.1504203 0.0293379 0.1838285 0.0293379 0.0667910 0.0558210 0.0781427
1 0.010 0.2030072 0.9720972 0.7091095 0.9313158 0.7811160 0.9568295 0.9362500 0.9066667 0.9820567 0.7822857 0.9820567 0.9362500 0.7818421 0.9214583 0.2028186 0.0406474 0.0632329 0.0584108 0.1681987 0.0393387 0.0679059 0.1646682 0.0315750 0.1838735 0.0315750 0.0679059 0.0566430 0.0833057
1 0.100 0.1994870 0.9716667 0.7270153 0.9225146 0.7692664 0.9504076 0.9171667 0.9500000 0.9901782 0.7374048 0.9901782 0.9171667 0.7659064 0.9335833 0.1401607 0.0413630 0.0644781 0.0621792 0.1633334 0.0424858 0.0734317 0.1286640 0.0251760 0.1785476 0.0251760 0.0734317 0.0612829 0.0696703
2 0.000 0.9153586 0.9272222 0.1831516 0.9323392 0.7705045 0.9584090 0.9467083 0.8600000 0.9729874 0.8012857 0.9729874 0.9467083 0.7905848 0.9033542 1.1217657 0.0893960 0.1496486 0.0508320 0.1631554 0.0323189 0.0552812 0.1784080 0.0342354 0.1787855 0.0342354 0.0552812 0.0462597 0.0892329
2 0.001 0.4029346 0.9712500 0.5218099 0.9302339 0.7714795 0.9565051 0.9394583 0.8833333 0.9776648 0.7871667 0.9776648 0.9394583 0.7845322 0.9113958 0.5225417 0.0460725 0.1112703 0.0552946 0.1638920 0.0370433 0.0642733 0.1732699 0.0328782 0.1808425 0.0328782 0.0642733 0.0537530 0.0860125
2 0.010 0.2214149 0.9721667 0.6899509 0.9423099 0.8060396 0.9643554 0.9525833 0.8900000 0.9790837 0.8264167 0.9790837 0.9525833 0.7954971 0.9212917 0.2728628 0.0421446 0.0815429 0.0521616 0.1644211 0.0338078 0.0572432 0.1711846 0.0323976 0.1762006 0.0323976 0.0572432 0.0479916 0.0866206
2 0.100 0.1617179 0.9787222 0.7658274 0.9483626 0.8313846 0.9679369 0.9545833 0.9166667 0.9838722 0.8399524 0.9838722 0.9545833 0.7971637 0.9356250 0.1220702 0.0339822 0.0493215 0.0531268 0.1630464 0.0339074 0.0570152 0.1450647 0.0281481 0.1807053 0.0281481 0.0570152 0.0477314 0.0777758
3 0.000 1.1588788 0.9249792 0.1866538 0.9313450 0.7674264 0.9575596 0.9460417 0.8566667 0.9725451 0.8033333 0.9725451 0.9460417 0.7900292 0.9013542 1.3585094 0.0962205 0.1218216 0.0634305 0.2008457 0.0414349 0.0651322 0.2078968 0.0395745 0.1991565 0.0395745 0.0651322 0.0544997 0.1079000
3 0.001 0.4454402 0.9598819 0.4689113 0.9323977 0.7655045 0.9585819 0.9492500 0.8466667 0.9705788 0.8058333 0.9705788 0.9492500 0.7927193 0.8979583 0.6119831 0.0589974 0.0985952 0.0638901 0.2180469 0.0400471 0.0595955 0.2192734 0.0413981 0.2114083 0.0413981 0.0595955 0.0500114 0.1159278
3 0.010 0.2179559 0.9714861 0.6913097 0.9456725 0.8196147 0.9664048 0.9540000 0.9033333 0.9814752 0.8376190 0.9814752 0.9540000 0.7966667 0.9286667 0.2347374 0.0450949 0.0836970 0.0520693 0.1630684 0.0330797 0.0563607 0.1592279 0.0304010 0.1797490 0.0304010 0.0563607 0.0470733 0.0823022
3 0.100 0.1592419 0.9790833 0.7641156 0.9521930 0.8429052 0.9704126 0.9585000 0.9200000 0.9843113 0.8488333 0.9843113 0.9585000 0.8004386 0.9392500 0.1254594 0.0335023 0.0517721 0.0542753 0.1697914 0.0342583 0.0539344 0.1430782 0.0281553 0.1788361 0.0281553 0.0539344 0.0452301 0.0806484
4 0.000 1.6249456 0.9134931 0.1737578 0.9335088 0.7778181 0.9587060 0.9466250 0.8666667 0.9744050 0.8092857 0.9744050 0.9466250 0.7905263 0.9066458 1.6087717 0.0969157 0.1120403 0.0633516 0.1935567 0.0418598 0.0671220 0.1953442 0.0372806 0.1970718 0.0372806 0.0671220 0.0562492 0.1019304
4 0.001 0.4024334 0.9630000 0.4906746 0.9383918 0.7882719 0.9622022 0.9531250 0.8633333 0.9738642 0.8224524 0.9738642 0.9531250 0.7959649 0.9082292 0.5117458 0.0584303 0.0858199 0.0578743 0.1939883 0.0363598 0.0569441 0.1958893 0.0371128 0.1926554 0.0371128 0.0569441 0.0479785 0.1024365
4 0.010 0.2181948 0.9695417 0.6891696 0.9411696 0.7972671 0.9640100 0.9552500 0.8700000 0.9748485 0.8280000 0.9748485 0.9552500 0.7977193 0.9126250 0.2300551 0.0437905 0.0812479 0.0590077 0.1989418 0.0366838 0.0540212 0.1947400 0.0373682 0.1916741 0.0373682 0.0540212 0.0452148 0.1046327
4 0.100 0.1540334 0.9800000 0.7661420 0.9483626 0.8344981 0.9677698 0.9539167 0.9200000 0.9840897 0.8395000 0.9840897 0.9539167 0.7966082 0.9369583 0.1222264 0.0328155 0.0505285 0.0608422 0.1807018 0.0390670 0.0615110 0.1430782 0.0286205 0.1898617 0.0286205 0.0615110 0.0514690 0.0835858
5 0.000 1.5981305 0.9205903 0.1794788 0.9273977 0.7567341 0.9550833 0.9412500 0.8566667 0.9725874 0.7900714 0.9725874 0.9412500 0.7860526 0.8989583 1.6375131 0.0974914 0.1022852 0.0603989 0.1943784 0.0385391 0.0639726 0.2024264 0.0384216 0.2021685 0.0384216 0.0639726 0.0538563 0.1033921
5 0.001 0.4933962 0.9594514 0.4722007 0.9291228 0.7549498 0.9564675 0.9478750 0.8333333 0.9682030 0.8047381 0.9682030 0.9478750 0.7915789 0.8906042 0.6166813 0.0634696 0.0940829 0.0627730 0.2084976 0.0398500 0.0628721 0.2145127 0.0400507 0.2018310 0.0400507 0.0628721 0.0528479 0.1114068
5 0.010 0.2322110 0.9660417 0.6738129 0.9445322 0.8081233 0.9659983 0.9572083 0.8800000 0.9770943 0.8395238 0.9770943 0.9572083 0.7993567 0.9186042 0.2406788 0.0494780 0.0795972 0.0541448 0.1763874 0.0346803 0.0536963 0.1866474 0.0351615 0.1732358 0.0351615 0.0536963 0.0449724 0.0969021
5 0.100 0.1539082 0.9808472 0.7658626 0.9511111 0.8401404 0.9696959 0.9558750 0.9266667 0.9857399 0.8408333 0.9857399 0.9558750 0.7982456 0.9412708 0.1258120 0.0316803 0.0489489 0.0514770 0.1598343 0.0326144 0.0534418 0.1387777 0.0270715 0.1737287 0.0270715 0.0534418 0.0448035 0.0760361
6 0.000 1.6094499 0.9126806 0.1618964 0.9379532 0.7781681 0.9623724 0.9578750 0.8366667 0.9689668 0.8305000 0.9689668 0.9578750 0.7999123 0.8972708 1.6218077 0.0980740 0.1066221 0.0578225 0.2063406 0.0355147 0.0519063 0.2144866 0.0402316 0.1944957 0.0402316 0.0519063 0.0434934 0.1121240
6 0.001 0.4306026 0.9609583 0.4875787 0.9362865 0.7686494 0.9614164 0.9571250 0.8300000 0.9682039 0.8293333 0.9682039 0.9571250 0.7992982 0.8935625 0.5658018 0.0641657 0.0918779 0.0578236 0.2141758 0.0353959 0.0528883 0.2344849 0.0430466 0.1949290 0.0430466 0.0528883 0.0445105 0.1195795
6 0.010 0.2457540 0.9670000 0.6716427 0.9400877 0.7893989 0.9634724 0.9564583 0.8566667 0.9726796 0.8320000 0.9726796 0.9564583 0.7987427 0.9065625 0.2515249 0.0472487 0.0813080 0.0560030 0.1939898 0.0348038 0.0527624 0.2024264 0.0377193 0.1845260 0.0377193 0.0527624 0.0444281 0.1054181
6 0.100 0.1529933 0.9802639 0.7627186 0.9516667 0.8461463 0.9697024 0.9552083 0.9333333 0.9869260 0.8470238 0.9869260 0.9552083 0.7976901 0.9442708 0.1310787 0.0300179 0.0487765 0.0579581 0.1660862 0.0379686 0.0618764 0.1340050 0.0264788 0.1799357 0.0264788 0.0618764 0.0518142 0.0763236
7 0.000 1.7890126 0.9213681 0.1359811 0.9285965 0.7613408 0.9559452 0.9434583 0.8533333 0.9714573 0.7941190 0.9714573 0.9434583 0.7878655 0.8983958 1.5175932 0.0825715 0.0938188 0.0619422 0.1971608 0.0390971 0.0620742 0.1913975 0.0375181 0.2058161 0.0375181 0.0620742 0.0518536 0.1018849
7 0.001 0.4634445 0.9570833 0.4960259 0.9357602 0.7646166 0.9612272 0.9585417 0.8200000 0.9660129 0.8282828 0.9660129 0.9585417 0.8004678 0.8892708 0.5937747 0.0664721 0.0756261 0.0574763 0.2180964 0.0350323 0.0503682 0.2292818 0.0419996 0.1924371 0.0419996 0.0503682 0.0421827 0.1182883
7 0.010 0.2422127 0.9668333 0.6705531 0.9410819 0.7926969 0.9641134 0.9564167 0.8633333 0.9739647 0.8295000 0.9739647 0.9564167 0.7987135 0.9098750 0.2466525 0.0439184 0.0847138 0.0557973 0.1971183 0.0344200 0.0510497 0.2070313 0.0388691 0.1856504 0.0388691 0.0510497 0.0431234 0.1076683
7 0.100 0.1537769 0.9795556 0.7558910 0.9521930 0.8445497 0.9702831 0.9558750 0.9333333 0.9870973 0.8416190 0.9870973 0.9558750 0.7982456 0.9446042 0.1336694 0.0326721 0.0529904 0.0507119 0.1551499 0.0324949 0.0542753 0.1340050 0.0259882 0.1695796 0.0259882 0.0542753 0.0454940 0.0727346
8 0.000 1.9168802 0.9128750 0.1437170 0.9287427 0.7500753 0.9565756 0.9500417 0.8200000 0.9653324 0.8029524 0.9653324 0.9500417 0.7933626 0.8850208 1.9137871 0.1025012 0.1087517 0.0616732 0.2101004 0.0383920 0.0564806 0.2087857 0.0393744 0.2032392 0.0393744 0.0564806 0.0471441 0.1115784
8 0.001 0.5120956 0.9517153 0.4933694 0.9395029 0.7809859 0.9633369 0.9610000 0.8300000 0.9680969 0.8488333 0.9680969 0.9610000 0.8025439 0.8955000 0.6723846 0.0708201 0.0975805 0.0562879 0.2016297 0.0347700 0.0535219 0.2196569 0.0405949 0.1906080 0.0405949 0.0535219 0.0452130 0.1119706
8 0.010 0.2437774 0.9677778 0.6679503 0.9418421 0.7921877 0.9646909 0.9592083 0.8533333 0.9722558 0.8405000 0.9722558 0.9592083 0.8010234 0.9062708 0.2336151 0.0424138 0.0906839 0.0546503 0.1944046 0.0337303 0.0504723 0.2082475 0.0385227 0.1810968 0.0385227 0.0504723 0.0422467 0.1075056
8 0.100 0.1544024 0.9795972 0.7506345 0.9527485 0.8453139 0.9707065 0.9578333 0.9266667 0.9857875 0.8485000 0.9857875 0.9578333 0.7998830 0.9422500 0.1363949 0.0332395 0.0528899 0.0515385 0.1604136 0.0326546 0.0538282 0.1387777 0.0269739 0.1739986 0.0269739 0.0538282 0.0451631 0.0758429
9 0.000 1.7310248 0.9146875 0.1325353 0.9291813 0.7458847 0.9570496 0.9493333 0.8266667 0.9670923 0.7888889 0.9670923 0.9493333 0.7927778 0.8880000 1.7319913 0.1135738 0.1032211 0.0591032 0.2273636 0.0359069 0.0512803 0.2344131 0.0431931 0.2071839 0.0431931 0.0512803 0.0429487 0.1212485
9 0.001 0.4639172 0.9608611 0.4888761 0.9389474 0.7805490 0.9630428 0.9604167 0.8300000 0.9676373 0.8393333 0.9676373 0.9604167 0.8020468 0.8952083 0.6037941 0.0532630 0.0821286 0.0565123 0.2002358 0.0347503 0.0508227 0.2091885 0.0394040 0.1924868 0.0394040 0.0508227 0.0428034 0.1093662
9 0.010 0.2509947 0.9663194 0.6729610 0.9400000 0.7854685 0.9635660 0.9590417 0.8433333 0.9702786 0.8378333 0.9702786 0.9590417 0.8009064 0.9011875 0.2615157 0.0483768 0.0885162 0.0593480 0.2111950 0.0365768 0.0532510 0.2194524 0.0410478 0.1942908 0.0410478 0.0532510 0.0449518 0.1147893
9 0.100 0.1555133 0.9802917 0.7467837 0.9544152 0.8516454 0.9716291 0.9591667 0.9300000 0.9863560 0.8555238 0.9863560 0.9591667 0.8009942 0.9445833 0.1394048 0.0302670 0.0515603 0.0539076 0.1587915 0.0351050 0.0564698 0.1364534 0.0267762 0.1714656 0.0267762 0.0564698 0.0473101 0.0760472
10 0.000 1.9144550 0.9082708 0.1346851 0.9295029 0.7508718 0.9571489 0.9525000 0.8133333 0.9642028 0.8161667 0.9642028 0.9525000 0.7954386 0.8829167 1.6780203 0.1003057 0.0970285 0.0545918 0.1872527 0.0338295 0.0540419 0.2027864 0.0383979 0.1961535 0.0383979 0.0540419 0.0455451 0.1032676
10 0.001 0.4759291 0.9539583 0.4889414 0.9389766 0.7808606 0.9629822 0.9605000 0.8300000 0.9677562 0.8452857 0.9677562 0.9605000 0.8021053 0.8952500 0.5884418 0.0738460 0.0934829 0.0560187 0.1961949 0.0348004 0.0533675 0.2091885 0.0390957 0.1894108 0.0390957 0.0533675 0.0446971 0.1076684
10 0.010 0.2455030 0.9669167 0.6511054 0.9422807 0.7928037 0.9649892 0.9610417 0.8466667 0.9709900 0.8486667 0.9709900 0.9610417 0.8025731 0.9038542 0.2451394 0.0439407 0.1010037 0.0551250 0.1949850 0.0340627 0.0509222 0.2087857 0.0386447 0.1827382 0.0386447 0.0509222 0.0429704 0.1079057
10 0.100 0.1537238 0.9809028 0.7469208 0.9549415 0.8533237 0.9720201 0.9591667 0.9333333 0.9869723 0.8543333 0.9869723 0.9591667 0.8009942 0.9462500 0.1402808 0.0306577 0.0501028 0.0523372 0.1592715 0.0334219 0.0540321 0.1340050 0.0263046 0.1725985 0.0263046 0.0540321 0.0452901 0.0749602
#The optimal selected model
selected_model<-fit_nnet_gc$results %>% 
  filter(size == fit_nnet_gc$bestTune$size & decay == fit_nnet_gc$bestTune$decay)
knitr::kable(selected_model)
size decay logLoss AUC prAUC Accuracy Kappa F1 Sensitivity Specificity Pos_Pred_Value Neg_Pred_Value Precision Recall Detection_Rate Balanced_Accuracy logLossSD AUCSD prAUCSD AccuracySD KappaSD F1SD SensitivitySD SpecificitySD Pos_Pred_ValueSD Neg_Pred_ValueSD PrecisionSD RecallSD Detection_RateSD Balanced_AccuracySD
3 0.1 0.1592419 0.9790833 0.7641156 0.952193 0.8429052 0.9704126 0.9585 0.92 0.9843113 0.8488333 0.9843113 0.9585 0.8004386 0.93925 0.1254594 0.0335023 0.0517721 0.0542753 0.1697914 0.0342583 0.0539344 0.1430782 0.0281553 0.1788361 0.0281553 0.0539344 0.0452301 0.0806484

Test NNET Models

Hyperspectral Imaging NNET Test Results
#Predict HSI test set
test_hsi_nnet<-predict(fit_nnet_hsi,newdata=data_test_hsi_cf)

#get the confusion matrix
cfmatrix_hsi<-confusionMatrix(test_hsi_nnet,data_test_hsi_cf$class_1)

#print the confusion matrix
print(cfmatrix_hsi)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         0
  Pure_EVOO             0        11
                                     
               Accuracy : 1          
                 95% CI : (0.934, 1) 
    No Information Rate : 0.7963     
    P-Value [Acc > NIR] : 4.55e-06   
                                     
                  Kappa : 1          
                                     
 Mcnemar's Test P-Value : NA         
                                     
            Sensitivity : 1.0000     
            Specificity : 1.0000     
         Pos Pred Value : 1.0000     
         Neg Pred Value : 1.0000     
             Prevalence : 0.7963     
         Detection Rate : 0.7963     
   Detection Prevalence : 0.7963     
      Balanced Accuracy : 1.0000     
                                     
       'Positive' Class : Adulterated
                                     
knitr::kable(cfmatrix_hsi$byClass)
x
Sensitivity 1.0000000
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 1.0000000
Precision 1.0000000
Recall 1.0000000
F1 1.0000000
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.7962963
Balanced Accuracy 1.0000000
#View the results as knitr table
knitr::kable(cfmatrix_hsi$table)
Adulterated Pure_EVOO
Adulterated 43 0
Pure_EVOO 0 11
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_hsi_nnet_table <- cfmatrix_hsi$table
TP <- cfmatrix_hsi_nnet_table[1,1] 
TN <- cfmatrix_hsi_nnet_table[2,2] 
FP <- cfmatrix_hsi_nnet_table[2,1] 
FN <- cfmatrix_hsi_nnet_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",MCC))
[1] "The MCC value for the model is 1"
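The same MCC calculation is repeated for each technique in this section, so it can be factored into a small helper that takes a confusion-matrix table directly. This is a sketch: the function name `mcc_from_table` is ours, not part of the original analysis, but the arithmetic is identical to the per-technique blocks.

```r
# Matthews Correlation Coefficient from a 2x2 confusion-matrix table
# (rows = predicted, columns = reference, positive class in row/column 1),
# matching the layout of caret's confusionMatrix()$table.
mcc_from_table <- function(tbl) {
  TP <- tbl[1, 1]; TN <- tbl[2, 2]
  FP <- tbl[2, 1]; FN <- tbl[1, 2]
  (TP * TN - FP * FN) /
    sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
}

# Example with the HSI NNET test-set counts above (43/0/0/11):
hsi_tbl <- matrix(c(43, 0, 0, 11), nrow = 2, byrow = TRUE)
mcc_from_table(hsi_tbl)#a perfect confusion matrix gives an MCC of 1
```

With this helper, each technique's MCC is a one-liner, e.g. `mcc_from_table(cfmatrix_raman$table)`.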
Assess Raman NNET Model on Test Data
#Predict Raman test set
test_raman_nnet<-predict(fit_nnet_raman,newdata=data_test_raman_cf)

#get the confusion matrix
cfmatrix_raman<-confusionMatrix(test_raman_nnet,data_test_raman_cf$class_1)

#print the confusion matrix
print(cfmatrix_raman)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          40         2
  Pure_EVOO             3         9
                                         
               Accuracy : 0.9074         
                 95% CI : (0.797, 0.9692)
    No Information Rate : 0.7963         
    P-Value [Acc > NIR] : 0.02431        
                                         
                  Kappa : 0.7239         
                                         
 Mcnemar's Test P-Value : 1.00000        
                                         
            Sensitivity : 0.9302         
            Specificity : 0.8182         
         Pos Pred Value : 0.9524         
         Neg Pred Value : 0.7500         
             Prevalence : 0.7963         
         Detection Rate : 0.7407         
   Detection Prevalence : 0.7778         
      Balanced Accuracy : 0.8742         
                                         
       'Positive' Class : Adulterated    
                                         
knitr::kable(cfmatrix_raman$byClass)
x
Sensitivity 0.9302326
Specificity 0.8181818
Pos Pred Value 0.9523810
Neg Pred Value 0.7500000
Precision 0.9523810
Recall 0.9302326
F1 0.9411765
Prevalence 0.7962963
Detection Rate 0.7407407
Detection Prevalence 0.7777778
Balanced Accuracy 0.8742072
#View the results as knitr table
knitr::kable(cfmatrix_raman$table)
Adulterated Pure_EVOO
Adulterated 40 2
Pure_EVOO 3 9
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_raman_nnet_table <- cfmatrix_raman$table
TP <- cfmatrix_raman_nnet_table[1,1] 
TN <- cfmatrix_raman_nnet_table[2,2] 
FP <- cfmatrix_raman_nnet_table[2,1] 
FN <- cfmatrix_raman_nnet_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.73"
FTIR Spectroscopy NNET Test Results
#Predict FTIR test set
test_ftir_nnet<-predict(fit_nnet_ftir,newdata=data_test_ftir_cf)

#get the confusion matrix
cfmatrix_ftir<-confusionMatrix(test_ftir_nnet,data_test_ftir_cf$class_1)

#print the confusion matrix
print(cfmatrix_ftir)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         0
  Pure_EVOO             0        11
                                     
               Accuracy : 1          
                 95% CI : (0.934, 1) 
    No Information Rate : 0.7963     
    P-Value [Acc > NIR] : 4.55e-06   
                                     
                  Kappa : 1          
                                     
 Mcnemar's Test P-Value : NA         
                                     
            Sensitivity : 1.0000     
            Specificity : 1.0000     
         Pos Pred Value : 1.0000     
         Neg Pred Value : 1.0000     
             Prevalence : 0.7963     
         Detection Rate : 0.7963     
   Detection Prevalence : 0.7963     
      Balanced Accuracy : 1.0000     
                                     
       'Positive' Class : Adulterated
                                     
knitr::kable(cfmatrix_ftir$byClass)
x
Sensitivity 1.0000000
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 1.0000000
Precision 1.0000000
Recall 1.0000000
F1 1.0000000
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.7962963
Balanced Accuracy 1.0000000
#View the results as knitr table
knitr::kable(cfmatrix_ftir$table)
Adulterated Pure_EVOO
Adulterated 43 0
Pure_EVOO 0 11
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_ftir_nnet_table <- cfmatrix_ftir$table
TP <- cfmatrix_ftir_nnet_table[1,1] 
TN <- cfmatrix_ftir_nnet_table[2,2] 
FP <- cfmatrix_ftir_nnet_table[2,1] 
FN <- cfmatrix_ftir_nnet_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",MCC))
[1] "The MCC value for the model is 1"
Assess UV-Vis NNET Model on Test Data
#Predict uvvis test set
test_uvvis_nnet<-predict(fit_nnet_uvvis,newdata=data_test_uvvis_cf)

#get the confusion matrix
cfmatrix_uvvis<-confusionMatrix(test_uvvis_nnet,data_test_uvvis_cf$class_1)

#print the confusion matrix
print(cfmatrix_uvvis)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          43         2
  Pure_EVOO             0         9
                                          
               Accuracy : 0.963           
                 95% CI : (0.8725, 0.9955)
    No Information Rate : 0.7963          
    P-Value [Acc > NIR] : 0.0004935       
                                          
                  Kappa : 0.8776          
                                          
 Mcnemar's Test P-Value : 0.4795001       
                                          
            Sensitivity : 1.0000          
            Specificity : 0.8182          
         Pos Pred Value : 0.9556          
         Neg Pred Value : 1.0000          
             Prevalence : 0.7963          
         Detection Rate : 0.7963          
   Detection Prevalence : 0.8333          
      Balanced Accuracy : 0.9091          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_uvvis$byClass)
x
Sensitivity 1.0000000
Specificity 0.8181818
Pos Pred Value 0.9555556
Neg Pred Value 1.0000000
Precision 0.9555556
Recall 1.0000000
F1 0.9772727
Prevalence 0.7962963
Detection Rate 0.7962963
Detection Prevalence 0.8333333
Balanced Accuracy 0.9090909
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_uvvis_nnet_table <- cfmatrix_uvvis$table
TP <- cfmatrix_uvvis_nnet_table[1,1] 
TN <- cfmatrix_uvvis_nnet_table[2,2] 
FP <- cfmatrix_uvvis_nnet_table[2,1] 
FN <- cfmatrix_uvvis_nnet_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.88"
Assess GC-MS NNET Model on Test Data
#Predict gc test set
test_gc_nnet<-predict(fit_nnet_gc,newdata=data_test_gc_cf)

#get the confusion matrix
cfmatrix_gc<-confusionMatrix(test_gc_nnet,data_test_gc_cf$class_1)

#print the confusion matrix
print(cfmatrix_gc)
Confusion Matrix and Statistics

             Reference
Prediction    Adulterated Pure_EVOO
  Adulterated          59         0
  Pure_EVOO             5        12
                                          
               Accuracy : 0.9342          
                 95% CI : (0.8531, 0.9783)
    No Information Rate : 0.8421          
    P-Value [Acc > NIR] : 0.01371         
                                          
                  Kappa : 0.7884          
                                          
 Mcnemar's Test P-Value : 0.07364         
                                          
            Sensitivity : 0.9219          
            Specificity : 1.0000          
         Pos Pred Value : 1.0000          
         Neg Pred Value : 0.7059          
             Prevalence : 0.8421          
         Detection Rate : 0.7763          
   Detection Prevalence : 0.7763          
      Balanced Accuracy : 0.9609          
                                          
       'Positive' Class : Adulterated     
                                          
knitr::kable(cfmatrix_gc$byClass)
x
Sensitivity 0.9218750
Specificity 1.0000000
Pos Pred Value 1.0000000
Neg Pred Value 0.7058824
Precision 1.0000000
Recall 0.9218750
F1 0.9593496
Prevalence 0.8421053
Detection Rate 0.7763158
Detection Prevalence 0.7763158
Balanced Accuracy 0.9609375
#Calculating Matthews Correlation Coefficient (MCC)
# Extracting the components of the confusion matrix
cfmatrix_gc_nnet_table <- cfmatrix_gc$table
TP <- cfmatrix_gc_nnet_table[1,1] 
TN <- cfmatrix_gc_nnet_table[2,2] 
FP <- cfmatrix_gc_nnet_table[2,1] 
FN <- cfmatrix_gc_nnet_table[1,2] 

# Calculating MCC
MCC <- (TP * TN - FP * FN) / sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

# Printing the MCC value
print(paste("The MCC value for the model is",round(MCC,2)))
[1] "The MCC value for the model is 0.81"
Plot Confusion Matrix Tables for NNET Binary Classification Algorithm
# Plotting the confusion matrix

#HSI NNET confusion Matrix
cf_hsi_nnet<-ggplot(data = as.data.frame(cfmatrix_hsi$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "white", high = "#99ccff", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix HSI NNET')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#Raman NNET confusion Matrix
cf_raman_nnet<-ggplot(data = as.data.frame(cfmatrix_raman$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkorange3", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix Raman NNET')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#FTIR NNET confusion Matrix
cf_ftir_nnet<-ggplot(data = as.data.frame(cfmatrix_ftir$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "gray84", high = "darkseagreen2", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix FTIR NNET')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))

#UV-Vis NNET confusion Matrix
cf_uvvis_nnet<-ggplot(data = as.data.frame(cfmatrix_uvvis$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "turquoise", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix UV-Vis NNET')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))
library(gridExtra)#grid.arrange() comes from gridExtra, not grid
grid.arrange(cf_hsi_nnet,cf_raman_nnet,cf_ftir_nnet,cf_uvvis_nnet,nrow = 2)

#GC-MS NNET confusion Matrix
ggplot(data = as.data.frame(cfmatrix_gc$table), aes(x = Reference, y = Prediction)) +
  geom_tile(aes(fill = Freq), color = "black") +
  geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1,color = "black",size = 4) +
  scale_fill_gradient(low = "azure1", high = "tan1", name = "Frequency") +
  theme_minimal() +
  labs(x = "Actual Class", y = "Predicted Class", color = 'black',title = 'Confusion Matrix GC-MS NNET')+
  theme(
    legend.position =  "none",
    axis.text.y = element_text(color = "black",size = 8),
    axis.title.x  = element_text(size = 8),
    axis.title.y = element_text(size = 8),aspect.ratio = 1,
    plot.title = element_text(size = 8,hjust = 0.5))
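The five confusion-matrix plots above differ only in the input table, fill colours, and title, so the repeated ggplot code can be collapsed into one helper. This is a sketch: `plot_cf` and its arguments are our naming, and it assumes ggplot2 is loaded as in the original chunks.

```r
library(ggplot2)

# Heatmap of a caret confusion-matrix table; low/high set the fill gradient.
plot_cf <- function(cf_table, title, low = "white", high = "#99ccff") {
  ggplot(as.data.frame(cf_table), aes(x = Reference, y = Prediction)) +
    geom_tile(aes(fill = Freq), color = "black") +
    geom_text(aes(label = sprintf("%0.0f", Freq)), vjust = 1, size = 4) +
    scale_fill_gradient(low = low, high = high, name = "Frequency") +
    theme_minimal() +
    labs(x = "Actual Class", y = "Predicted Class", title = title) +
    theme(legend.position = "none",
          axis.text.y = element_text(color = "black", size = 8),
          axis.title = element_text(size = 8),
          plot.title = element_text(size = 8, hjust = 0.5),
          aspect.ratio = 1)
}

# e.g. plot_cf(cfmatrix_hsi$table, "Confusion Matrix HSI NNET")
```

Each technique's plot then becomes a single call, with only the colours and title varying.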

Part IV. Regression Models: Prediction of Adulteration in Extra-Virgin Olive Oil

  • This section employs supervised regression models to predict the levels of adulterants in olive oil.
  • The model’s performance is evaluated on the test set using two key metrics: Root Mean Square Error (RMSE) and Residual Prediction Deviation (RPD). RMSE measures the average magnitude of the model’s prediction errors, providing insight into the accuracy of the model by quantifying the difference between observed and predicted values. A lower RMSE indicates better model accuracy.
  • RPD, on the other hand, evaluates the model’s predictive ability relative to the natural variability of the data. It is the ratio of the standard deviation of the observed values to the RMSE; an RPD above 3.0 generally indicates excellent predictive performance. Together, these metrics characterize the model’s effectiveness and reliability in quantifying olive oil adulteration.
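As a concrete illustration of the two metrics, here is a minimal sketch with toy adulteration levels (the numbers are invented for illustration, not drawn from the study data):

```r
# RMSE: root of the mean squared difference between observed and predicted
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# RPD: standard deviation of the observed values divided by the RMSE
rpd <- function(obs, pred) sd(obs) / rmse(obs, pred)

# Toy adulteration levels (%): each prediction is off by exactly 1
obs  <- c(0, 5, 10, 15, 20)
pred <- c(1, 4, 11, 14, 21)

rmse(obs, pred)#1
rpd(obs, pred) #sd(obs) is about 7.91, so RPD is about 7.91 (well above 3.0)
```

Note that the RPD depends on the spread of the reference values: the same RMSE yields a higher RPD on a wider adulteration range.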
#HSI Regression data
#we will use data reduced by PCA

#set seed for reproducibility
set.seed(123)

hsi_reg<-hsi_new[,-1]
train_index_hsi_reg<-createDataPartition(hsi_reg$perc_adulter,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_hsi_reg<-hsi_reg[train_index_hsi_reg,]
data_test_hsi_reg<-hsi_reg[-train_index_hsi_reg,]

#Check the dimensions of the train and test set for HSI data

dim(data_train_hsi_reg)
[1] 130   6
dim(data_test_hsi_reg)
[1] 53  6
#Raman Regression data

#we will use data reduced by PCA

#set seed for reproducibility
set.seed(123)

raman_reg<-raman_new[,-1]
train_index_raman_reg<-createDataPartition(raman_reg$perc_adulter,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_raman_reg<-raman_reg[train_index_raman_reg,]
data_test_raman_reg<-raman_reg[-train_index_raman_reg,]

#Check the dimensions of the train and test set for raman data

dim(data_train_raman_reg)
[1] 130   6
dim(data_test_raman_reg)
[1] 53  6
#FTIR Regression data
#set seed for reproducibility
set.seed(123)

ftir_reg<-ftir_new[,-1]
train_index_ftir_reg<-createDataPartition(ftir_reg$perc_adulter,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_ftir_reg<-ftir_reg[train_index_ftir_reg,]
data_test_ftir_reg<-ftir_reg[-train_index_ftir_reg,]

#Check the dimensions of the train and test set for ftir data

dim(data_train_ftir_reg)
[1] 130   6
dim(data_test_ftir_reg)
[1] 53  6
#UV-VIS Regression data
#set seed for reproducibility
set.seed(123)

uvvis_reg<-uvvis_new[,-1]
train_index_uvvis_reg<-createDataPartition(uvvis_reg$perc_adulter,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_uvvis_reg<-uvvis_reg[train_index_uvvis_reg,]
data_test_uvvis_reg<-uvvis_reg[-train_index_uvvis_reg,]

#Check the dimensions of the train and test set for uvvis data

dim(data_train_uvvis_reg)
[1] 130   6
dim(data_test_uvvis_reg)
[1] 53  6
#GC-MS Regression data
#set seed for reproducibility
set.seed(123)

gc_reg<-gc_new[,-1]
train_index_gc_reg<-createDataPartition(gc_reg$perc_adulter,p = 0.7,list=FALSE)

#split the data as train and test set
data_train_gc_reg<-gc_reg[train_index_gc_reg,]
data_test_gc_reg<-gc_reg[-train_index_gc_reg,]

#Check the dimensions of the train and test set for gc data

dim(data_train_gc_reg)
[1] 182   6
dim(data_test_gc_reg)
[1] 76  6
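The same stratified 70/30 split is applied to each technique's data above; the pattern can be written once as a helper. This is a sketch: the name `split_reg` is ours, and it assumes caret is available and that the response column is named `perc_adulter`, as in the chunks above.

```r
# Stratified 70/30 train/test split on the adulteration percentage.
# Returns the train and test subsets in a named list.
split_reg <- function(df, p = 0.7, seed = 123) {
  set.seed(seed)#same seed as the per-technique chunks above
  idx <- caret::createDataPartition(df$perc_adulter, p = p, list = FALSE)
  list(train = df[idx, ], test = df[-idx, ])
}

# e.g. hsi_split <- split_reg(hsi_new[, -1]); dim(hsi_split$train)
```

Because the seed is reset inside the helper, each technique receives a reproducible partition, matching the per-technique chunks.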

Set up the parameters for training the models using 10-fold cross-validation repeated 10 times.

The optimal model is selected with the one-standard-error rule (oneSE): the simplest model whose cross-validated root mean square error is within one standard error of the best-performing model is chosen, to minimize the risk of overfitting.

# Set up the training control (10-fold cross-validation repeated 10 times)
control_r <- trainControl(method = "repeatedcv", number = 10, repeats = 10, 
                          selectionFunction = 'oneSE')
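The one-standard-error rule can be illustrated on a toy tuning table. This is a simplified sketch of the idea, not caret's exact implementation (caret derives the standard error from the resamples and orders candidates by its own notion of model complexity); here the standard errors are given directly, and we treat larger k, which gives a smoother fit, as the simpler k-NN model.

```r
# Toy cross-validation results: mean RMSE and its standard error per k
cv <- data.frame(
  k      = c(3, 5, 7, 9),
  RMSE   = c(2.10, 1.80, 1.82, 1.95),
  RMSESE = c(0.04, 0.03, 0.03, 0.05)
)

# Threshold: best mean RMSE plus one standard error of the best model
best      <- which.min(cv$RMSE)
threshold <- cv$RMSE[best] + cv$RMSESE[best]

# Among models within the threshold, keep the simplest (largest k here)
candidates <- cv[cv$RMSE <= threshold, ]
k_oneSE    <- max(candidates$k)
k_oneSE#7: within one SE of the best model (k = 5) but smoother
```

The rule trades a marginal loss in cross-validated RMSE for a less flexible model that is less likely to overfit the training folds.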

k-Nearest Neighbors (k-NN) Regression

#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

#start_time
start_time<-Sys.time()

#Set up grid for k neighbors
grid_knn<-expand.grid(.k = seq(3,30, by = 2))#Odd values of k from 3 to 29; ties only matter for classification, since regression averages the k neighbours

#Train k-NN models for all the techniques

#HSI k-NN model
fit_knn_hsi_reg<-train(perc_adulter~.,data=data_train_hsi_reg,
                       method = "knn",tuneGrid = grid_knn,trControl = control_r,metric = "RMSE")

#Raman k-NN model
fit_knn_raman_reg<-train(y=data_train_raman_reg[,1],x=data_train_raman_reg[,-1],
                         method = "knn",tuneGrid = grid_knn,trControl = control_r,metric = "RMSE")

#FTIR k-NN model
fit_knn_ftir_reg<-train(y=data_train_ftir_reg[,1],x=data_train_ftir_reg[,-1],
                        method = "knn",tuneGrid = grid_knn,trControl = control_r,metric = "RMSE")

#UV-Vis k-NN model
fit_knn_uvvis_reg<-train(y=data_train_uvvis_reg[,1],x=data_train_uvvis_reg[,-1],
                         method = "knn",tuneGrid = grid_knn,trControl = control_r,metric = "RMSE")
#GC-MS k-NN model
fit_knn_gc_reg<-train(y=data_train_gc_reg[,1],x=data_train_gc_reg[,-1],
                      method = "knn",tuneGrid = grid_knn,trControl = control_r,metric = "RMSE")
#End_time
end_time<-Sys.time()
model_training_time<-end_time-start_time
stopCluster(cl)#stop the parallel run cluster
### Plot Hyperparameter Tuning Curves for kNN Regression Model Selection


#HSI kNN Regression CV Plot
p1_knnr_hsi<-ggplot(fit_knn_hsi_reg)+geom_line(colour = "red")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='HSI kNN Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

#Raman kNN Regression CV Plot
p2_knnr_raman<-ggplot(fit_knn_raman_reg)+geom_line(colour = "blue")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='Raman kNN Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#FTIR kNN Regression CV Plot
p3_knnr_ftir<-ggplot(fit_knn_ftir_reg)+geom_line(colour = "black")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='FTIR kNN Regression Model Training', y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#UV-Vis kNN Regression CV Plot
p4_knnr_uvvis<-ggplot(fit_knn_uvvis_reg)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+geom_point(pch = 4)+
  theme(
    panel.grid = element_blank())+
  labs(title ='UV-Vis kNN Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)
#Arrange the kNN CV Plots
gridExtra::grid.arrange(p1_knnr_hsi,p2_knnr_raman,p3_knnr_ftir,p4_knnr_uvvis,nrow = 2)

#GC kNN Regression CV Plot
ggplot(fit_knn_gc_reg)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='GC-MS kNN Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

Display the cross-validation results for the kNN regression models
#HSI CV results
print(paste('The optimal number of neighbors k for training the HSI kNN regression model is',fit_knn_hsi_reg$bestTune))
[1] "The optimal number of neighbors k for training the HSI kNN regression model is 3"
#Output table
knitr::kable(fit_knn_hsi_reg$results)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 5.405983 0.9162527 3.522384 1.916933 0.1649909 1.257026
5 6.490076 0.9034310 4.410425 2.450123 0.1869288 1.398912
7 10.150687 0.8941987 6.554743 3.398937 0.1927122 1.723607
9 11.634883 0.8861882 7.186105 4.329982 0.1980172 2.066284
11 13.874729 0.8798893 8.137334 4.790863 0.1966519 2.296083
13 15.414597 0.8780556 8.705920 5.153833 0.1881247 2.459693
15 16.389310 0.8753701 9.084936 5.468083 0.1925155 2.593291
17 17.298009 0.8721787 9.463430 5.837238 0.1927531 2.759084
19 18.071282 0.8651685 9.813515 6.124389 0.1961828 2.883991
21 18.627202 0.8623999 10.028668 6.090291 0.1993572 2.882422
23 19.323182 0.8580452 10.333689 6.264508 0.2023210 2.960292
25 19.933847 0.8537361 10.602035 6.458089 0.1997650 3.049638
27 20.374223 0.8489854 10.810954 6.567347 0.2006850 3.107864
29 20.853585 0.8418868 11.051370 6.754189 0.1992564 3.185141
#The optimal selected model
selected_model<-fit_knn_hsi_reg$results %>% filter(k==as.numeric(fit_knn_hsi_reg$bestTune))
knitr::kable(selected_model)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 5.405983 0.9162527 3.522384 1.916933 0.1649909 1.257026
#Raman CV results
print(paste('The optimal number of neighbors k for training the Raman kNN regression model is',fit_knn_raman_reg$bestTune))
[1] "The optimal number of neighbors k for training the Raman kNN regression model is 3"
#Output table
knitr::kable(fit_knn_raman_reg$results)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 7.837980 0.8494303 4.743900 5.321805 0.2232947 2.021165
5 8.715419 0.8383172 5.487149 5.335552 0.2429007 2.179668
7 10.402385 0.8357241 6.516441 6.257434 0.2290100 2.615308
9 12.640605 0.8211862 7.813819 6.427865 0.2332443 2.715878
11 14.606034 0.8049412 8.846689 6.519336 0.2502902 2.866281
13 15.951813 0.8019262 9.524310 6.709133 0.2507710 3.021233
15 17.106550 0.8045296 10.057091 6.853357 0.2427625 3.164015
17 18.160003 0.8047508 10.523792 7.131193 0.2381553 3.338157
19 19.101631 0.8040112 10.981585 7.317854 0.2403326 3.458027
21 19.834673 0.8040609 11.290294 7.468808 0.2304022 3.552678
23 20.314074 0.8066949 11.552949 7.526798 0.2366656 3.640498
25 20.743239 0.7994951 11.785877 7.563876 0.2418080 3.726511
27 21.332916 0.7812592 12.092062 7.702401 0.2504587 3.787391
29 21.786613 0.7640746 12.345692 7.834008 0.2509757 3.885662
#The optimal selected model
selected_model<-fit_knn_raman_reg$results %>% filter(k==as.numeric(fit_knn_raman_reg$bestTune))
knitr::kable(selected_model)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 7.83798 0.8494303 4.7439 5.321805 0.2232947 2.021165
#FTIR CV results
print(paste('The optimal number of neighbors k for training the FTIR kNN regression model is',fit_knn_ftir_reg$bestTune))
[1] "The optimal number of neighbors k for training the FTIR kNN regression model is 5"
#Output table
knitr::kable(fit_knn_ftir_reg$results)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 20.71330 0.4817322 11.61979 8.783967 0.2924883 4.582204
5 20.54467 0.5161838 12.14143 7.249003 0.2790918 4.260974
7 21.29513 0.4970930 13.32189 5.959748 0.2628737 3.821181
9 23.05693 0.4412734 14.46964 5.691240 0.2434425 3.921297
11 25.06844 0.3429688 15.73473 5.791963 0.2331179 4.012526
13 26.30024 0.2667580 16.33404 6.087802 0.2079709 4.131957
15 27.10847 0.2124650 16.72358 6.417412 0.1948146 4.220525
17 27.57121 0.1888443 16.96195 6.605680 0.1883982 4.233344
19 27.51303 0.1792769 16.99917 6.697154 0.1772035 4.271511
21 27.56002 0.1680976 17.36855 6.654040 0.1830820 4.271426
23 27.72455 0.1576760 17.64112 6.523525 0.1810987 4.227622
25 27.86481 0.1542269 17.69819 6.451341 0.1805326 4.140831
27 27.78053 0.1568564 17.53341 6.553724 0.1871713 4.076932
29 27.73821 0.1542895 17.39766 6.801393 0.1858103 4.081301
#The optimal selected model
selected_model<-fit_knn_ftir_reg$results %>% filter(k==as.numeric(fit_knn_ftir_reg$bestTune))
knitr::kable(selected_model)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
5 20.54467 0.5161838 12.14143 7.249003 0.2790918 4.260974
#UV-Vis CV results
print(paste('The optimal number of neighbors k for training the UV-Vis kNN regression model is',fit_knn_uvvis_reg$bestTune))
[1] "The optimal number of neighbors k for training the UV-Vis kNN regression model is 5"
#Output table
knitr::kable(fit_knn_uvvis_reg$results)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 4.819181 0.9321703 2.754868 2.284013 0.1256416 1.157614
5 4.673807 0.9312442 3.003703 1.625241 0.1367067 1.021082
7 6.409086 0.9146005 3.895936 2.864558 0.1491408 1.452538
9 7.474391 0.9059889 4.415878 3.853449 0.1501540 1.833599
11 8.284583 0.9008230 5.079393 4.183489 0.1491217 1.927230
13 9.902626 0.9006641 6.043584 4.495437 0.1464137 2.054280
15 11.432158 0.8960562 6.796973 4.845633 0.1505725 2.275111
17 12.607736 0.8974339 7.380246 4.784634 0.1533593 2.367025
19 13.657634 0.8985559 7.853574 5.059162 0.1548791 2.574132
21 14.688674 0.8967480 8.289374 5.378716 0.1589107 2.759330
23 15.604124 0.8935033 8.683066 5.749276 0.1602956 2.943100
25 16.411336 0.8905426 8.982574 6.125135 0.1592958 3.104373
27 17.093794 0.8895081 9.292288 6.435805 0.1578571 3.244498
29 17.673852 0.8850172 9.597154 6.682104 0.1625368 3.366707
#The optimal selected model
selected_model<-fit_knn_uvvis_reg$results %>% filter(k==as.numeric(fit_knn_uvvis_reg$bestTune))
knitr::kable(selected_model)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
5 4.673807 0.9312442 3.003703 1.625241 0.1367067 1.021082
#GC-MS CV results
print(paste('The optimal number of neighbors k for training the GC-MS kNN regression model is',fit_knn_gc_reg$bestTune))
[1] "The optimal number of neighbors k for training the GC-MS kNN regression model is 3"
#Output table
knitr::kable(fit_knn_gc_reg$results)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 6.672229 0.9344659 3.915318 4.250346 0.0924702 1.594939
5 7.946577 0.9122086 4.512573 5.159878 0.1129959 1.888181
7 8.506153 0.8992767 4.918491 5.257620 0.1279819 1.907852
9 8.690903 0.8968997 5.131615 5.307935 0.1278980 1.909567
11 8.983284 0.8914055 5.397126 5.265954 0.1303245 1.903457
13 9.482792 0.8768401 5.666986 5.661031 0.1440686 2.030311
15 9.850380 0.8706559 5.990446 5.781591 0.1500542 2.039385
17 10.501781 0.8694535 6.671017 5.802469 0.1514939 2.041315
19 11.526464 0.8667671 7.375223 5.783876 0.1514367 2.078004
21 12.730504 0.8624729 8.039460 5.667993 0.1565450 2.141219
23 13.833208 0.8599049 8.585425 5.541034 0.1603447 2.210763
25 14.853931 0.8581777 9.048086 5.461127 0.1613310 2.283228
27 15.762597 0.8564739 9.459834 5.426760 0.1619975 2.376965
29 16.579831 0.8559815 9.816263 5.423698 0.1600650 2.460341
#The optimal selected model
selected_model<-fit_knn_gc_reg$results %>% filter(k==as.numeric(fit_knn_gc_reg$bestTune))
knitr::kable(selected_model)
k RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 6.672229 0.9344659 3.915318 4.250346 0.0924702 1.594939
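As an optional diagnostic (a sketch, not run in this workflow), caret's `resamples()` can pool the five kNN fits so their cross-validated metrics are compared on one scale; all five fits share the same 10 x 10 repeated-CV scheme, which `resamples()` requires:

```r
# Pool the 10 x 10 CV results from all five kNN regression fits
resamps <- resamples(list(HSI   = fit_knn_hsi_reg,
                          Raman = fit_knn_raman_reg,
                          FTIR  = fit_knn_ftir_reg,
                          UVVis = fit_knn_uvvis_reg,
                          GCMS  = fit_knn_gc_reg))
summary(resamps)                 # per-technique RMSE/Rsquared/MAE distributions
bwplot(resamps, metric = "RMSE") # lattice box plots of RMSECV
```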

Testing HSI kNN Regression Models

##Use the optimal model to predict the external samples

#Predict the test set from HSI

prediction_knnr_hsi<-predict(fit_knn_hsi_reg,newdata = data_test_hsi_reg)

# Evaluate the model's performance with RMSE and R²
results_knnr_hsi <- postResample(prediction_knnr_hsi,data_test_hsi_reg$perc_adulter)
print(results_knnr_hsi)
     RMSE  Rsquared       MAE 
5.3705463 0.9565908 3.8616352 
RMSE<-results_knnr_hsi[1]
Rsquared<-results_knnr_hsi[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 5.3705 and the RSquared is 0.9566"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_hsi_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the kNN HSI model is',round(RPD,2)))
[1] "The RPD value of the kNN HSI model is 4.78"
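Because the RPD calculation is repeated for every technique below, it can be wrapped in a small helper (a sketch; the interpretation thresholds are commonly cited chemometric guidelines, not taken from this workflow):

```r
# Residual Predictive Deviation: SD of observed values over prediction RMSE
rpd <- function(observed, predicted) {
  sd(observed) / sqrt(mean((observed - predicted)^2))
}
# Rule of thumb often used in chemometrics:
#   RPD < 1.4 -> unreliable; 1.4-2 -> rough screening;
#   2-3       -> approximate quantification; > 3 -> good quantitative model
rpd(data_test_hsi_reg$perc_adulter, prediction_knnr_hsi)  # ~4.78, as above
```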
# ------------------------------------------------------------------------------
  
#### Testing Raman kNN Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from Raman 

prediction_knnr_raman<- predict(fit_knn_raman_reg,newdata = data_test_raman_reg)

# Evaluate the model's performance with RMSE and R²
results_knnr_raman <- postResample(prediction_knnr_raman,data_test_raman_reg$perc_adulter)
print(results_knnr_raman)
     RMSE  Rsquared       MAE 
5.1340521 0.9610114 3.7547170 
RMSE<-results_knnr_raman[1]
Rsquared<-results_knnr_raman[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 5.1341 and the RSquared is 0.961"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_raman_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the kNN Raman model is',round(RPD,2)))
[1] "The RPD value of the kNN Raman model is 5.01"
# ------------------------------------------------------------------------------
#### Testing FTIR kNN Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from FTIR

prediction_knnr_ftir<-predict(fit_knn_ftir_reg,newdata = data_test_ftir_reg)

# Evaluate the model's performance with RMSE and R²
results_knnr_ftir <- postResample(prediction_knnr_ftir,data_test_ftir_reg$perc_adulter)
print(results_knnr_ftir)
      RMSE   Rsquared        MAE 
19.3938524  0.4669979 11.5735849 
RMSE<-results_knnr_ftir[1]
Rsquared<-results_knnr_ftir[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 19.3939 and the RSquared is 0.467"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_ftir_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the kNN FTIR model is',round(RPD,2)))
[1] "The RPD value of the kNN FTIR model is 1.33"
# ------------------------------------------------------------------------------
#### Testing UV-Vis kNN Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from UV-Vis

prediction_knnr_uvvis<-predict(fit_knn_uvvis_reg,newdata = data_test_uvvis_reg)

# Evaluate the model's performance with RMSE and R²
results_knnr_uvvis <- postResample(prediction_knnr_uvvis,data_test_uvvis_reg$perc_adulter)
print(results_knnr_uvvis)
    RMSE Rsquared      MAE 
4.432811 0.971506 2.679245 
RMSE<-results_knnr_uvvis[1]
Rsquared<-results_knnr_uvvis[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 4.4328 and the RSquared is 0.9715"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_uvvis_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the kNN UV-Vis model is',round(RPD,2)))
[1] "The RPD value of the kNN UV-Vis model is 5.8"
# ------------------------------------------------------------------------------
#### Testing GC-MS kNN Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from GC-MS

prediction_knnr_gc<-predict(fit_knn_gc_reg,newdata = data_test_gc_reg)

# Evaluate the model's performance with RMSE and R²
results_knnr_gc <- postResample(prediction_knnr_gc,data_test_gc_reg$perc_adulter)
print(results_knnr_gc)
     RMSE  Rsquared       MAE 
5.6893252 0.9598953 3.4210526 
RMSE<-results_knnr_gc[1]
Rsquared<-results_knnr_gc[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 5.6893 and the RSquared is 0.9599"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_gc_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the kNN GC-MS model is',round(RPD,2)))
[1] "The RPD value of the kNN GC-MS model is 4.9"
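The external-validation metrics computed above can also be summarized in one table (a sketch using the `results_knnr_*` objects created in the testing steps above; `postResample()` returns a named vector, so row-binding yields a comparison matrix):

```r
# Collect the test-set metrics of all five kNN regression models
test_metrics <- rbind(HSI   = results_knnr_hsi,
                      Raman = results_knnr_raman,
                      FTIR  = results_knnr_ftir,
                      UVVis = results_knnr_uvvis,
                      GCMS  = results_knnr_gc)
knitr::kable(round(test_metrics, 4))     # RMSE, Rsquared, MAE per technique
```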
#-------------------------------------------------------------------------------

Random Forest (RF) Regression

#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

#start_time
start_time<-Sys.time()

#Use tuneLength to search the mtry grid: set to 4 candidate values

#Train RF models for all the techniques

#HSI RF model
fit_rf_hsi_reg<-train(perc_adulter~.,data=data_train_hsi_reg,
                      method = "rf",tuneLength = 4,trControl = control_r,metric = "RMSE")

#Raman RF model
fit_rf_raman_reg<-train(y=data_train_raman_reg[,1],x=data_train_raman_reg[,-1],
                        method = "rf",tuneLength = 4,trControl = control_r,metric = "RMSE")

#FTIR RF model
fit_rf_ftir_reg<-train(y=data_train_ftir_reg[,1],x=data_train_ftir_reg[,-1],
                       method = "rf",tuneLength = 4,trControl = control_r,metric = "RMSE")

#UV-Vis RF model
fit_rf_uvvis_reg<-train(y=data_train_uvvis_reg[,1],x=data_train_uvvis_reg[,-1],
                        method = "rf",tuneLength = 4,trControl = control_r,metric = "RMSE")
#GC-MS RF model
fit_rf_gc_reg<-train(y=data_train_gc_reg[,1],x=data_train_gc_reg[,-1],
                     method = "rf",tuneLength = 4,trControl = control_r,metric = "RMSE")
#End_time
end_time<-Sys.time()
model_training_time<-end_time-start_time
stopCluster(cl)#stop the parallel run cluster
### Plot Hyperparameter Tuning Curves for RF Regression Model Selection


#HSI RF Regression CV Plot
p1_rfr_hsi<-ggplot(fit_rf_hsi_reg)+geom_line(colour = "red")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='HSI RF Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

#Raman RF Regression CV Plot
p2_rfr_raman<-ggplot(fit_rf_raman_reg)+geom_line(colour = "blue")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='Raman RF Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#FTIR RF Regression CV Plot
p3_rfr_ftir<-ggplot(fit_rf_ftir_reg)+geom_line(colour = "black")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='FTIR RF Regression Model Training', y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#UV-Vis RF Regression CV Plot
p4_rfr_uvvis<-ggplot(fit_rf_uvvis_reg)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+geom_point(pch = 4)+
  theme(
    panel.grid = element_blank())+
  labs(title ='UV-Vis RF Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)
#Arrange the RF CV Plots
gridExtra::grid.arrange(p1_rfr_hsi,p2_rfr_raman,p3_rfr_ftir,p4_rfr_uvvis,nrow = 2)

#GC RF Regression CV Plot
ggplot(fit_rf_gc_reg)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='GC-MS RF Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

Display the cross-validation results for the RF regression models
#HSI CV results
print(paste('The optimal mtry value for training the HSI RF regression model is',fit_rf_hsi_reg$bestTune))
[1] "The optimal mtry value for training the HSI RF regression model is 2"
#Output table
knitr::kable(fit_rf_hsi_reg$results)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
2 8.720824 0.8756256 5.539749 4.888853 0.2006056 2.001334
3 8.517172 0.8753853 5.058670 6.164605 0.2075210 2.240200
4 8.293420 0.8783703 4.694036 6.566834 0.2022858 2.288297
5 8.139738 0.8782620 4.393435 6.715863 0.1989648 2.335293
#The optimal selected model
selected_model<-fit_rf_hsi_reg$results %>% filter(mtry==as.numeric(fit_rf_hsi_reg$bestTune))
knitr::kable(selected_model)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
2 8.720824 0.8756256 5.539749 4.888853 0.2006056 2.001334
#Raman CV results
print(paste('The optimal mtry value for training the Raman RF regression model is',fit_rf_raman_reg$bestTune))
[1] "The optimal mtry value for training the Raman RF regression model is 4"
#Output table
knitr::kable(fit_rf_raman_reg$results)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
2 8.214384 0.9336097 5.458374 4.945422 0.1204914 1.921318
3 5.949583 0.9442512 3.707187 5.369861 0.1133747 1.879597
4 5.142168 0.9420699 2.913737 5.793883 0.1215967 1.914391
5 4.982751 0.9378681 2.600602 6.064019 0.1336055 1.975601
#The optimal selected model
selected_model<-fit_rf_raman_reg$results %>% filter(mtry==as.numeric(fit_rf_raman_reg$bestTune))
knitr::kable(selected_model)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
4 5.142168 0.9420699 2.913737 5.793883 0.1215967 1.914391
#FTIR CV results
print(paste('The optimal mtry value for training the FTIR RF regression model is',fit_rf_ftir_reg$bestTune))
[1] "The optimal mtry value for training the FTIR RF regression model is 2"
#Output table
knitr::kable(fit_rf_ftir_reg$results)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
2 18.28494 0.6064255 11.17437 7.824426 0.2934035 4.068958
3 18.04316 0.6138436 10.86642 8.206654 0.2972340 4.273606
4 18.14398 0.6139742 10.79265 8.341930 0.2971163 4.361703
5 18.37482 0.5971244 10.85853 8.546533 0.3034131 4.507896
#The optimal selected model
selected_model<-fit_rf_ftir_reg$results %>% filter(mtry==as.numeric(fit_rf_ftir_reg$bestTune))
knitr::kable(selected_model)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
2 18.28494 0.6064255 11.17437 7.824426 0.2934035 4.068958
#UV-Vis CV results
print(paste('The optimal mtry value for training the UV-Vis RF regression model is',fit_rf_uvvis_reg$bestTune))
[1] "The optimal mtry value for training the UV-Vis RF regression model is 2"
#Output table
knitr::kable(fit_rf_uvvis_reg$results)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
2 8.027305 0.9158364 4.868164 3.635114 0.1218011 1.717559
3 7.795407 0.9148070 4.609906 3.912179 0.1279376 1.809583
4 8.106138 0.9054858 4.630147 4.702709 0.1353394 2.086111
5 8.584003 0.8927417 4.740410 5.520458 0.1481126 2.358973
#The optimal selected model
selected_model<-fit_rf_uvvis_reg$results %>% filter(mtry==as.numeric(fit_rf_uvvis_reg$bestTune))
knitr::kable(selected_model)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
2 8.027305 0.9158364 4.868164 3.635114 0.1218011 1.717559
#GC-MS CV results
print(paste('The optimal mtry value for training the GC-MS RF regression model is',fit_rf_gc_reg$bestTune))
[1] "The optimal mtry value for training the GC-MS RF regression model is 3"
#Output table
knitr::kable(fit_rf_gc_reg$results)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
2 7.766909 0.8710970 5.673272 3.031110 0.2085142 1.249028
3 6.990309 0.8846144 4.897252 3.893715 0.1780642 1.474022
4 6.783678 0.8890288 4.471481 4.638936 0.1660548 1.661134
5 6.869558 0.8783774 4.293573 5.266748 0.1873236 1.832599
#The optimal selected model
selected_model<-fit_rf_gc_reg$results %>% filter(mtry==as.numeric(fit_rf_gc_reg$bestTune))
knitr::kable(selected_model)
mtry RMSE Rsquared MAE RMSESD RsquaredSD MAESD
3 6.990309 0.8846144 4.897252 3.893715 0.1780642 1.474022

Testing HSI RF Regression Models

##Use the optimal model to predict the external samples

#Predict the test set from HSI

prediction_rfr_hsi<-predict(fit_rf_hsi_reg,newdata = data_test_hsi_reg)

# Evaluate the model's performance with RMSE and R²
results_rfr_hsi <- postResample(prediction_rfr_hsi,data_test_hsi_reg$perc_adulter)
print(results_rfr_hsi)
     RMSE  Rsquared       MAE 
7.8987713 0.9171852 5.0284252 
RMSE<-results_rfr_hsi[1]
Rsquared<-results_rfr_hsi[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 7.8988 and the RSquared is 0.9172"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_hsi_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the RF HSI model is',round(RPD,2)))
[1] "The RPD value of the RF HSI model is 3.25"
#### Testing Raman RF Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from Raman 

prediction_rfr_raman<-predict(fit_rf_raman_reg,newdata = data_test_raman_reg)

# Evaluate the model's performance with RMSE and R²
results_rfr_raman <- postResample(prediction_rfr_raman,data_test_raman_reg$perc_adulter)
print(results_rfr_raman)
RMSE<-results_rfr_raman[1]
Rsquared<-results_rfr_raman[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_raman_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the RF Raman model is',round(RPD,2)))
#### Testing FTIR RF Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from FTIR

prediction_rfr_ftir<-predict(fit_rf_ftir_reg,newdata = data_test_ftir_reg)

# Evaluate the model's performance with RMSE and R²
results_rfr_ftir <- postResample(prediction_rfr_ftir,data_test_ftir_reg$perc_adulter)
print(results_rfr_ftir)
     RMSE  Rsquared       MAE 
9.3756475 0.8803745 6.5270579 
RMSE<-results_rfr_ftir[1]
Rsquared<-results_rfr_ftir[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 9.3756 and the RSquared is 0.8804"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_ftir_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the RF FTIR model is',round(RPD,2)))
[1] "The RPD value of the RF FTIR model is 2.74"
#### Testing UV-Vis RF Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from UV-Vis

prediction_rfr_uvvis<-predict(fit_rf_uvvis_reg,newdata = data_test_uvvis_reg)

# Evaluate the model's performance with RMSE and R²
results_rfr_uvvis <- postResample(prediction_rfr_uvvis,data_test_uvvis_reg$perc_adulter)
print(results_rfr_uvvis)
     RMSE  Rsquared       MAE 
8.1192168 0.8997566 4.5645044 
RMSE<-results_rfr_uvvis[1]
Rsquared<-results_rfr_uvvis[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 8.1192 and the RSquared is 0.8998"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_uvvis_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the RF UV-Vis model is',round(RPD,2)))
[1] "The RPD value of the RF UV-Vis model is 3.16"
#### Testing GC-MS RF Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from GC-MS

prediction_rfr_gc<-predict(fit_rf_gc_reg,newdata = data_test_gc_reg)

# Evaluate the model's performance with RMSE and R²
results_rfr_gc <- postResample(prediction_rfr_gc,data_test_gc_reg$perc_adulter)
print(results_rfr_gc)
     RMSE  Rsquared       MAE 
6.5838481 0.9455823 4.1005684 
RMSE<-results_rfr_gc[1]
Rsquared<-results_rfr_gc[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 6.5838 and the RSquared is 0.9456"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_gc_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the RF GC-MS model is',round(RPD,2)))
[1] "The RPD value of the RF GC-MS model is 4.23"

Support Vector Machine Regression (Radial Basis)

#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

#start_time
start_time<-Sys.time()

#Use tuneLength to search Cost values for the RBF kernel (sigma is estimated once and held fixed)

#Train SVM  models for all the techniques

#HSI SVM Regression model
fit_svm_hsi_reg<-train(perc_adulter~.,data=data_train_hsi_reg,
                       method = "svmRadial",tuneLength = 10,trControl = control_r,metric = "RMSE")

#Raman SVM Regression model
fit_svm_raman_reg<-train(perc_adulter~.,data=data_train_raman_reg,
                         method = "svmRadial",tuneLength = 10,trControl = control_r,metric = "RMSE")

#FTIR SVM Regression model
fit_svm_ftir_reg<-train(perc_adulter~.,data=data_train_ftir_reg,
                        method = "svmRadial",tuneLength = 10,trControl = control_r,metric = "RMSE")

#UV-Vis SVM Regression model
fit_svm_uvvis_reg<-train(perc_adulter~.,data=data_train_uvvis_reg,
                         method = "svmRadial",tuneLength = 10,trControl = control_r,metric = "RMSE")
#GC-MS SVM Regression model
fit_svm_gc_reg<-train(perc_adulter~.,data=data_train_gc_reg,
                      method = "svmRadial",tuneLength = 10,trControl = control_r,metric = "RMSE")
#End_time
end_time<-Sys.time()
model_training_time<-end_time-start_time
stopCluster(cl)#stop the parallel run cluster
### Plot Hyperparameter Tuning Curves for SVM Regression Model Selection

#HSI SVM Regression CV Plot
p1_svmr_hsi<-ggplot(fit_svm_hsi_reg)+geom_line(colour = "red")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='HSI SVM Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black"),
        aspect.ratio = 1)

#Raman SVM Regression CV Plot
p2_svmr_raman<-ggplot(fit_svm_raman_reg)+geom_line(colour = "blue")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='Raman SVM Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#FTIR SVM Regression CV Plot
p3_svmr_ftir<-ggplot(fit_svm_ftir_reg)+geom_line(colour = "black")+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='FTIR SVM Regression Model Training', y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size = 8),
        aspect.ratio = 1)

#UV-Vis SVM Regression CV Plot
p4_svmr_uvvis<-ggplot(fit_svm_uvvis_reg)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+geom_point(pch = 4)+
  theme(
    panel.grid = element_blank())+
  labs(title ='UV-Vis SVM Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)
#Arrange the SVM CV Plots
gridExtra::grid.arrange(p1_svmr_hsi,p2_svmr_raman,p3_svmr_ftir,p4_svmr_uvvis,nrow = 2)

#GC SVM Regression CV Plot
ggplot(fit_svm_gc_reg)+geom_line(colour = "black",linetype = 'dashed')+ 
  theme_bw()+
  theme(
    panel.grid = element_blank())+
  labs(title ='GC-MS SVM Regression Model Training',y = "RMSECV")+
  theme(plot.title = element_text(hjust = 0.5,size = 8),
        axis.title = element_text(size =8),
        axis.text = element_text(colour = "black",size =8),
        aspect.ratio = 1)

Display the cross-validation results for the SVM regression models
#HSI CV results
print(paste('The optimal parameters for training the HSI SVM regression model are C =',fit_svm_hsi_reg$bestTune$C,'and sigma =',round(fit_svm_hsi_reg$bestTune$sigma,2)))
[1] "The optimal parameters for training the HSI SVM regression model are C = 2 and sigma = 0.34"
#Output table
knitr::kable(fit_svm_hsi_reg$results)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.3364615 0.25 21.314019 0.8448900 10.508693 8.111743 0.1075002 4.133116
0.3364615 0.50 14.998223 0.8893501 8.223880 5.593008 0.1487399 2.991646
0.3364615 1.00 8.581058 0.9031577 5.536017 2.805123 0.1496241 1.649994
0.3364615 2.00 7.697111 0.9067564 5.035309 2.599850 0.1436639 1.502324
0.3364615 4.00 7.671983 0.9089699 5.000331 2.596149 0.1396544 1.494979
0.3364615 8.00 7.663693 0.9095692 4.990817 2.597545 0.1388471 1.495080
0.3364615 16.00 7.663693 0.9095692 4.990817 2.597545 0.1388471 1.495080
0.3364615 32.00 7.663693 0.9095692 4.990817 2.597545 0.1388471 1.495080
0.3364615 64.00 7.663693 0.9095692 4.990817 2.597545 0.1388471 1.495080
0.3364615 128.00 7.663693 0.9095692 4.990817 2.597545 0.1388471 1.495080
#The optimal selected model
selected_model<-fit_svm_hsi_reg$results %>% filter(sigma == as.numeric(fit_svm_hsi_reg$bestTune$sigma) &  C == as.numeric(fit_svm_hsi_reg$bestTune$C))
knitr::kable(selected_model)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.3364615 2 7.697111 0.9067564 5.035309 2.59985 0.1436639 1.502324
#Raman CV results
print(paste('The optimal parameters for training the Raman SVM regression model are C =',fit_svm_raman_reg$bestTune$C,'and sigma =',round(fit_svm_raman_reg$bestTune$sigma,2)))
[1] "The optimal parameters for training the Raman SVM regression model are C = 4 and sigma = 0.24"
#Output table
knitr::kable(fit_svm_raman_reg$results)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.2370377 0.25 19.412439 0.8882934 9.980097 7.430909 0.1098680 3.696266
0.2370377 0.50 12.404719 0.9129798 6.972305 5.117112 0.1032977 2.544765
0.2370377 1.00 8.436821 0.9217988 5.265168 3.427013 0.1033118 1.722582
0.2370377 2.00 6.922290 0.9288537 4.679239 2.564651 0.0987711 1.391857
0.2370377 4.00 6.384150 0.9314740 4.431318 2.406098 0.0982424 1.348055
0.2370377 8.00 6.364950 0.9341756 4.416164 2.411270 0.0913522 1.322526
0.2370377 16.00 6.349667 0.9350975 4.412722 2.417454 0.0900345 1.312623
0.2370377 32.00 6.349667 0.9350975 4.412722 2.417454 0.0900345 1.312623
0.2370377 64.00 6.349667 0.9350975 4.412722 2.417454 0.0900345 1.312623
0.2370377 128.00 6.349667 0.9350975 4.412722 2.417454 0.0900345 1.312623
#The optimal selected model
selected_model<-fit_svm_raman_reg$results %>% filter(sigma == as.numeric(fit_svm_raman_reg$bestTune$sigma) &  C == as.numeric(fit_svm_raman_reg$bestTune$C))
knitr::kable(selected_model)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.2370377 4 6.38415 0.931474 4.431318 2.406098 0.0982424 1.348055
#FTIR CV results
print(paste('The optimal hyperparameters for the FTIR-SVMR model are Cost =',fit_svm_ftir_reg$bestTune$C,'and sigma =',round(fit_svm_ftir_reg$bestTune$sigma,2)))
[1] "The optimal hyperparameters for the FTIR-SVMR model are Cost = 2 and sigma = 0.56"
#Output table
knitr::kable(fit_svm_ftir_reg$results)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.5559264 0.25 24.89624 0.4357231 13.41466 8.525613 0.3411522 4.082841
0.5559264 0.50 20.92767 0.4916098 11.85407 9.680395 0.3308904 4.110327
0.5559264 1.00 19.53131 0.5262202 11.08372 9.670034 0.3211408 4.208523
0.5559264 2.00 18.87294 0.5526699 10.98674 8.963510 0.3135884 4.091748
0.5559264 4.00 18.64195 0.5551109 11.07756 8.695847 0.3141387 4.134350
0.5559264 8.00 18.91576 0.5469236 11.31231 8.481421 0.3113037 4.138981
0.5559264 16.00 18.69484 0.5602143 11.39689 8.154220 0.3093569 4.111556
0.5559264 32.00 18.23479 0.5795828 11.30960 7.932539 0.3085868 4.080821
0.5559264 64.00 18.39975 0.5717083 11.48637 7.891773 0.3088468 4.119754
0.5559264 128.00 19.10990 0.5541784 11.90817 8.528341 0.3146987 4.530861
#The optimal selected model
selected_model<-fit_svm_ftir_reg$results %>% filter(sigma == as.numeric(fit_svm_ftir_reg$bestTune$sigma) &  C == as.numeric(fit_svm_ftir_reg$bestTune$C))
knitr::kable(selected_model)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.5559264 2 18.87294 0.5526699 10.98674 8.96351 0.3135884 4.091748
#UV-Vis CV results
print(paste('The optimal hyperparameters for the UV-Vis-SVMR model are Cost =',fit_svm_uvvis_reg$bestTune$C,'and sigma =',round(fit_svm_uvvis_reg$bestTune$sigma,2)))
[1] "The optimal hyperparameters for the UV-Vis-SVMR model are Cost = 4 and sigma = 0.5"
#Output table
knitr::kable(fit_svm_uvvis_reg$results)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.5044782 0.25 17.550287 0.8678025 9.204258 6.996607 0.1577245 3.581977
0.5044782 0.50 9.966697 0.8777472 6.079242 5.856287 0.1767953 2.545087
0.5044782 1.00 8.305596 0.8881196 5.285365 4.680856 0.1834046 2.110208
0.5044782 2.00 6.856065 0.9046007 4.740128 3.238676 0.1856198 1.643505
0.5044782 4.00 6.112556 0.9105191 4.347223 3.205053 0.1779902 1.604812
0.5044782 8.00 6.169090 0.9099381 4.367331 3.222847 0.1741668 1.628701
0.5044782 16.00 6.346046 0.9079968 4.485181 3.261402 0.1709321 1.652186
0.5044782 32.00 6.578040 0.9046800 4.661526 3.313788 0.1690744 1.699031
0.5044782 64.00 6.681607 0.9031608 4.773328 3.352329 0.1695398 1.760212
0.5044782 128.00 6.657800 0.9042560 4.775030 3.334042 0.1683198 1.753817
#The optimal selected model
selected_model<-fit_svm_uvvis_reg$results %>% filter(sigma == as.numeric(fit_svm_uvvis_reg$bestTune$sigma) &  C == as.numeric(fit_svm_uvvis_reg$bestTune$C))
knitr::kable(selected_model)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.5044782 4 6.112556 0.9105191 4.347223 3.205053 0.1779902 1.604812
#GC-MS CV results
print(paste('The optimal hyperparameters for the GC-MS-SVMR model are Cost =',fit_svm_gc_reg$bestTune$C,'and sigma =',round(fit_svm_gc_reg$bestTune$sigma,2)))
[1] "The optimal hyperparameters for the GC-MS-SVMR model are Cost = 2 and sigma = 0.42"
#Output table
knitr::kable(fit_svm_gc_reg$results)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.4166362 0.25 16.244904 0.8529294 9.129134 6.747479 0.1387116 3.289797
0.4166362 0.50 10.532421 0.8730817 6.184622 5.543138 0.1315507 2.423261
0.4166362 1.00 7.892077 0.9061402 4.946972 4.154306 0.1242764 1.851727
0.4166362 2.00 6.337594 0.9252535 4.357740 2.667780 0.1196483 1.339910
0.4166362 4.00 6.312303 0.9244507 4.363282 2.619833 0.1224626 1.328557
0.4166362 8.00 6.340399 0.9228565 4.394780 2.592648 0.1273435 1.301837
0.4166362 16.00 6.416506 0.9204045 4.464199 2.546279 0.1293524 1.262973
0.4166362 32.00 6.564763 0.9158229 4.569878 2.514977 0.1356340 1.226159
0.4166362 64.00 6.848549 0.9075719 4.780491 2.508786 0.1475748 1.226953
0.4166362 128.00 7.203700 0.8993609 5.059140 2.533273 0.1574666 1.282061
#The optimal selected model
selected_model<-fit_svm_gc_reg$results %>% filter(sigma == as.numeric(fit_svm_gc_reg$bestTune$sigma) &  C == as.numeric(fit_svm_gc_reg$bestTune$C))
knitr::kable(selected_model)
sigma C RMSE Rsquared MAE RMSESD RsquaredSD MAESD
0.4166362 2 6.337594 0.9252535 4.35774 2.66778 0.1196483 1.33991
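The per-technique "optimal selected model" filter above can be written more compactly: `merge()` joins caret's full cross-validation results with the `bestTune` row on their shared tuning columns. A minimal sketch with stand-in data frames (the values mimic a slice of the HSI table above; with a real fit, `merge(fit$results, fit$bestTune)` does the same job as the `filter()` calls):

```r
# Stand-ins for a caret train object's $results and $bestTune slots
results  <- data.frame(sigma = c(0.34, 0.34), C = c(1, 2), RMSE = c(8.58, 7.70))
bestTune <- data.frame(sigma = 0.34, C = 2)

# merge() keeps only the rows whose (sigma, C) match bestTune,
# i.e. the cross-validated optimum
selected_model <- merge(results, bestTune)
print(selected_model)
```

This avoids repeating the `filter(sigma == ... & C == ...)` condition for every technique.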

Testing HSI SVM Regression Models

##Use the optimal model to predict the external samples

#Predict the test set from HSI

prediction_svmr_hsi<-predict(fit_svm_hsi_reg,newdata = data_test_hsi_reg)

# Evaluate the model's performance with RMSE and R²
results_svmr_hsi <- postResample(prediction_svmr_hsi,data_test_hsi_reg$perc_adulter)
print(results_svmr_hsi)
     RMSE  Rsquared       MAE 
9.9081150 0.9051078 5.0236396 
RMSE<-results_svmr_hsi[1]
Rsquared<-results_svmr_hsi[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 9.9081 and the RSquared is 0.9051"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_hsi_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the SVM HSI model is',round(RPD,2)))
[1] "The RPD value of the SVM HSI model is 2.59"
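The RMSE/R²/MAE/RPD evaluation repeated for each technique below can be bundled into one helper. A base-R sketch using the same formulas as `caret::postResample` (RMSE, squared correlation, MAE) plus the RPD step above:

```r
# Bundle the test-set metrics computed chunk-by-chunk in this section:
# RMSE, R-squared (squared correlation), MAE, and RPD = sd(observed) / RMSE
evaluate_regression <- function(predicted, observed) {
  rmse <- sqrt(mean((predicted - observed)^2))
  rsq  <- cor(predicted, observed)^2
  mae  <- mean(abs(predicted - observed))
  c(RMSE = rmse, Rsquared = rsq, MAE = mae, RPD = sd(observed) / rmse)
}

# Example with synthetic adulteration percentages
observed  <- c(0, 5, 10, 20, 40, 80)
predicted <- c(1, 4, 12, 19, 42, 78)
print(round(evaluate_regression(predicted, observed), 2))
```

With the fitted models, each testing chunk then reduces to one call, e.g. `evaluate_regression(predict(fit_svm_hsi_reg, data_test_hsi_reg), data_test_hsi_reg$perc_adulter)`.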
Testing Raman SVM Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from Raman

prediction_svmr_raman<-predict(fit_svm_raman_reg,newdata = data_test_raman_reg)

# Evaluate the model's performance with RMSE and R²
results_svmr_raman <- postResample(prediction_svmr_raman,data_test_raman_reg$perc_adulter)
print(results_svmr_raman)
     RMSE  Rsquared       MAE 
6.9072654 0.9350547 4.1562377 
RMSE<-results_svmr_raman[1]
Rsquared<-results_svmr_raman[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 6.9073 and the RSquared is 0.9351"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_raman_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the SVM Raman model is',round(RPD,2)))
[1] "The RPD value of the SVM Raman model is 3.72"
Testing FTIR SVM Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from FTIR

prediction_svmr_ftir<-predict(fit_svm_ftir_reg,newdata = data_test_ftir_reg)

# Evaluate the model's performance with RMSE and R²
results_svmr_ftir <- postResample(prediction_svmr_ftir,data_test_ftir_reg$perc_adulter)
print(results_svmr_ftir)
      RMSE   Rsquared        MAE 
10.5530671  0.8802338  8.0161611 
RMSE<-results_svmr_ftir[1]
Rsquared<-results_svmr_ftir[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 10.5531 and the RSquared is 0.8802"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_ftir_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the SVM FTIR model is',round(RPD,2)))
[1] "The RPD value of the SVM FTIR model is 2.44"
Testing UV-Vis SVM Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from UV-Vis

prediction_svmr_uvvis<-predict(fit_svm_uvvis_reg,newdata = data_test_uvvis_reg)

# Evaluate the model's performance with RMSE and R²
results_svmr_uvvis <- postResample(prediction_svmr_uvvis,data_test_uvvis_reg$perc_adulter)
print(results_svmr_uvvis)
     RMSE  Rsquared       MAE 
6.3403737 0.9531194 3.8885423 
RMSE<-results_svmr_uvvis[1]
Rsquared<-results_svmr_uvvis[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 6.3404 and the RSquared is 0.9531"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_uvvis_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the SVM UV-VIS model is',round(RPD,2)))
[1] "The RPD value of the SVM UV-VIS model is 4.05"
Testing GC-MS SVM Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from GC-MS

prediction_svmr_gc<-predict(fit_svm_gc_reg,newdata = data_test_gc_reg)

# Evaluate the model's performance with RMSE and R²
results_svmr_gc <- postResample(prediction_svmr_gc,data_test_gc_reg$perc_adulter)
print(results_svmr_gc)
     RMSE  Rsquared       MAE 
5.4302528 0.9718888 3.6375409 
RMSE<-results_svmr_gc[1]
Rsquared<-results_svmr_gc[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 5.4303 and the RSquared is 0.9719"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_gc_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the SVM GC model is',round(RPD,2)))
[1] "The RPD value of the SVM GC model is 5.13"

Multiple Linear Regression/Principal Component Regression

#Register cluster for caret to train the models in parallel
cl<-makeCluster(6,type = "SOCK")
suppressWarnings(suppressMessages(
  registerDoSNOW(cl)))

#start_time
start_time<-Sys.time()

#Train PCR  models for all the techniques

#HSI PC Regression model
fit_lm_hsi_reg<-train(perc_adulter~.,data=data_train_hsi_reg,
                      method = "lm",trControl = control_r,metric = "RMSE")

#Raman PC Regression model
fit_lm_raman_reg<-train(perc_adulter~.,data=data_train_raman_reg,
                        method = "lm",trControl = control_r,metric = "RMSE")

#FTIR PC Regression model
fit_lm_ftir_reg<-train(perc_adulter~.,data=data_train_ftir_reg,
                       method = "lm",trControl = control_r,metric = "RMSE")

#UV-Vis PC Regression model
fit_lm_uvvis_reg<-train(perc_adulter~.,data=data_train_uvvis_reg,
                        method = "lm",trControl = control_r,metric = "RMSE")
#GC-MS PC Regression model
fit_lm_gc_reg<-train(perc_adulter~.,data=data_train_gc_reg,
                     method = "lm",trControl = control_r,metric = "RMSE")
#End_time
end_time<-Sys.time()
model_training_time<-end_time-start_time
stopCluster(cl)#stop the parallel run cluster
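The cluster above is started with doSNOW and stopped manually; if `train()` errors, `stopCluster()` is never reached and the workers leak. A hedged alternative pattern using the base `parallel` package, where `on.exit()` guarantees cleanup (the helper name `with_cluster` is illustrative, not from the original code; registering the cluster as a caret backend would still happen inside the supplied function):

```r
library(parallel)

# Run a function with a temporary worker cluster; on.exit() releases the
# workers even if the function throws an error part-way through
with_cluster <- function(workers, fun) {
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl), add = TRUE)
  fun(cl)
}

# Toy usage: square numbers across 2 workers
squares <- with_cluster(2, function(cl) parSapply(cl, 1:4, function(x) x^2))
print(squares)
```

The same wrapper could host the model-training calls above, removing the need to remember the trailing `stopCluster(cl)`.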
Display cross-validation results for the MLR/PCR models
#HSI CV results
knitr::kable(fit_lm_hsi_reg$results)
intercept RMSE Rsquared MAE RMSESD RsquaredSD MAESD
TRUE 6.131016 0.9497725 4.284404 2.099634 0.073126 1.188943
#Raman CV results
knitr::kable(fit_lm_raman_reg$results)
intercept RMSE Rsquared MAE RMSESD RsquaredSD MAESD
TRUE 9.745638 0.9034262 5.983139 4.108713 0.0892152 1.846991
#FTIR CV results
knitr::kable(fit_lm_ftir_reg$results)
intercept RMSE Rsquared MAE RMSESD RsquaredSD MAESD
TRUE 20.19309 0.4915514 14.2014 6.31092 0.2759023 3.722209
#UV-VIS CV results
knitr::kable(fit_lm_uvvis_reg$results)
intercept RMSE Rsquared MAE RMSESD RsquaredSD MAESD
TRUE 10.80195 0.8307609 7.667385 2.961134 0.1737829 1.989256
#GC CV results
knitr::kable(fit_lm_gc_reg$results)
intercept RMSE Rsquared MAE RMSESD RsquaredSD MAESD
TRUE 7.65271 0.9304674 5.144659 3.094408 0.0885246 1.359751

Testing HSI MLR/PC Regression Models

##Use the optimal model to predict the external samples

#Predict the test set from HSI

prediction_lmr_hsi<-predict(fit_lm_hsi_reg,newdata = data_test_hsi_reg)

# Evaluate the model's performance with RMSE and R²
results_lmr_hsi <- postResample(prediction_lmr_hsi,data_test_hsi_reg$perc_adulter)
print(results_lmr_hsi)
     RMSE  Rsquared       MAE 
6.3114105 0.9513491 4.2512485 
RMSE<-results_lmr_hsi[1]
Rsquared<-results_lmr_hsi[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 6.3114 and the RSquared is 0.9513"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_hsi_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the lm HSI model is',round(RPD,2)))
[1] "The RPD value of the lm HSI model is 4.07"
Testing Raman MLR/PC Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from Raman

prediction_lmr_raman<-predict(fit_lm_raman_reg,newdata = data_test_raman_reg)

# Evaluate the model's performance with RMSE and R²
results_lmr_raman <- postResample(prediction_lmr_raman,data_test_raman_reg$perc_adulter)
print(results_lmr_raman)
    RMSE Rsquared      MAE 
8.413590 0.908793 5.118624 
RMSE<-results_lmr_raman[1]
Rsquared<-results_lmr_raman[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 8.4136 and the RSquared is 0.9088"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_raman_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the lm raman model is',round(RPD,2)))
[1] "The RPD value of the lm raman model is 3.05"
Testing FTIR MLR/PC Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from FTIR

prediction_lmr_ftir<-predict(fit_lm_ftir_reg,newdata = data_test_ftir_reg)

# Evaluate the model's performance with RMSE and R²
results_lmr_ftir <- postResample(prediction_lmr_ftir,data_test_ftir_reg$perc_adulter)
print(results_lmr_ftir)
      RMSE   Rsquared        MAE 
15.1114948  0.6499046 10.9339001 
RMSE<-results_lmr_ftir[1]
Rsquared<-results_lmr_ftir[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 15.1115 and the RSquared is 0.6499"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_ftir_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the lm ftir model is',round(RPD,2)))
[1] "The RPD value of the lm ftir model is 1.7"
Testing UV-Vis MLR/PC Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from UV-Vis
#(note: predictions must use data_test_uvvis_reg; the run captured below used
#data_test_hsi_reg by mistake, a feature mismatch that explains the inflated
#errors that follow)

prediction_lmr_uvvis<-predict(fit_lm_uvvis_reg,newdata = data_test_uvvis_reg)

# Evaluate the model's performance with RMSE and R²
results_lmr_uvvis <- postResample(prediction_lmr_uvvis,data_test_uvvis_reg$perc_adulter)
print(results_lmr_uvvis)
      RMSE   Rsquared        MAE 
55.1500782  0.4001719 37.2078424 
RMSE<-results_lmr_uvvis[1]
Rsquared<-results_lmr_uvvis[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 55.1501 and the RSquared is 0.4002"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_uvvis_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the lm uvvis model is',round(RPD,2)))
[1] "The RPD value of the lm uvvis model is 0.47"
Testing GC-MS MLR/PC Regression Models


##Use the optimal model to predict the external samples

#Predict the test set from GC

prediction_lmr_gc<-predict(fit_lm_gc_reg,newdata = data_test_gc_reg)

# Evaluate the model's performance with RMSE and R²
results_lmr_gc <- postResample(prediction_lmr_gc,data_test_gc_reg$perc_adulter)
print(results_lmr_gc)
     RMSE  Rsquared       MAE 
8.1484614 0.9177609 4.5855990 
RMSE<-results_lmr_gc[1]
Rsquared<-results_lmr_gc[2]
print(paste('The RMSE is',round(RMSE,4), 'and the RSquared is', round(Rsquared,4)))
[1] "The RMSE is 8.1485 and the RSquared is 0.9178"
#Calculate the Residual Predictive Deviation (RPD)

# Calculate standard deviation of observed values
sd_observed <- sd(data_test_gc_reg$perc_adulter)

#Calculate RPD
RPD <- sd_observed/RMSE

print(paste('The RPD value of the lm gc model is',round(RPD,2)))
[1] "The RPD value of the lm gc model is 3.42"
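The external test-set figures reported above can be gathered into one comparison table (values transcribed from the printed outputs; `knitr::kable(test_summary)` would render it in the report's house style):

```r
# Test-set RMSE and RPD per technique, transcribed from the chunks above
test_summary <- data.frame(
  Technique = c("HSI", "Raman", "FTIR", "UV-Vis", "GC-MS"),
  SVM_RMSE  = c(9.91, 6.91, 10.55, 6.34, 5.43),
  SVM_RPD   = c(2.59, 3.72, 2.44, 4.05, 5.13),
  PCR_RMSE  = c(6.31, 8.41, 15.11, 55.15, 8.15),
  PCR_RPD   = c(4.07, 3.05, 1.70, 0.47, 3.42)
)
print(test_summary, row.names = FALSE)
```

The GC-MS SVM model (RPD 5.13) and the HSI PCR model (RPD 4.07) post the strongest external predictive performance.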

Conclusion

  • The project demonstrates the potential of hyperspectral imaging (HSI) as a fast, non-destructive method for detecting fraud in olive oil. Combined with chemometrics and machine learning models, the approach rivals the destructive benchmark techniques, reaching up to 100% accuracy in distinguishing pure from adulterated olive oil. The regression models also predict the concentration of the contaminant oils accurately, with test-set RMSE values of roughly 5-6% adulteration for the best models.

  • Notably, the HSI-based PCR model achieved an RPD above 3 (4.07 on the test set), which is indicative of excellent predictive performance. An RPD greater than 3 implies the model can reliably quantify varying levels of adulteration, underscoring its robustness and effectiveness for detecting fraud in olive oil.